An Overview Of Package Management
I am busy working on a package manager for Factor. This post is about the general process of package management.
Package management has two main parts:
- The local Factor installation.
- The remote package repository.
The following schematic depicts the parts of the Factor package management system:
+--------------------------------------------------+
|            Remote Package Repository             |
|                                                  |
| +------------+   +-----------------------------+ |
| | Repository |   |        Package Files        | |
| |   Index    |   |                             | |
| +------------+   +-----------------------------+ |
|                                                  |
+--------------------------------------------------+
                         |
                         |
+--------------------------------------------------+
|            Local Factor Installation             |
|                                                  |
| +-------+  +---------+    +------------------+   |
| | index |  | package |    |   Factor files   |   |
| | cache |  |  cache  |    |                  |   |
| +-------+  +---------+    |                  |   |
|                           |                  |   |
| +----------+              |                  |   |
| | work dir |     -->      |                  |   |
| +----------+              +------------------+   |
|                                                  |
+--------------------------------------------------+
The Remote Package Repository
The remote package repository has two main components:
- the repository index
- the package files (tarballs)
The repository index is a collection of files that store information about each package in the repository, such as its name, version, dependencies, and the physical location of the package file.
The repository index is split into multiple files to make synchronizing the indexes more efficient. The package manager uses the repository index to obtain information about the packages to be installed and to download the necessary packages. Thus, the package manager's view of the remote repository is always what it sees in the index files. This means that a package repository's index must be updated as new packages are added to the repository. The Factor package manager will provide words for creating and maintaining a package repository, so many of these tasks will be automated.
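To make the idea of an index entry concrete, here is a rough sketch in Python (not Factor, and not the package manager's real code). The one-line-per-package text format, the `IndexEntry` type, and the `parse_index_line` helper are all assumptions made up for illustration; the actual index format is not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One package as described by the repository index (hypothetical shape)."""
    name: str
    version: str
    location: str             # where the package file lives inside the repository
    dependencies: tuple = ()  # names of the packages this one needs


def parse_index_line(line):
    """Parse 'name|version|location|dep1,dep2' (an invented format)."""
    name, version, location, deps = line.strip().split("|")
    return IndexEntry(name, version, location,
                      tuple(d for d in deps.split(",") if d))


entry = parse_index_line("foo4|1.0|f/foo4-1.0.fpkg|foo1,foo2,foo3")
print(entry.name, entry.version, entry.dependencies)
```

Whatever the real format turns out to be, these four pieces of information are what the package manager needs in order to plan an installation without touching the package files themselves.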
The other component of the package repository is the actual package files. These are ordinary .tar.bz2 archives, but they carry the extension .fpkg. A Factor package has a specific layout and requires certain files to be present, so I decided to change the file extension to .fpkg to make it clear that it is not just a plain .tar.bz2 tarball.
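Because an .fpkg file is a bzip2-compressed tarball under another name, any standard tar tooling can read it. The sketch below uses Python's `tarfile` module to build a tiny package in memory and list its members. The file names inside the archive and the `build_fpkg` / `list_fpkg` helpers are hypothetical; the required package layout is not spelled out here.

```python
import io
import tarfile


def build_fpkg(files):
    """Pack {member name: bytes} into .tar.bz2 bytes (what an .fpkg contains)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_fpkg(blob):
    """Return the sorted member names of an .fpkg given as bytes."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as tar:
        return sorted(tar.getnames())


# Illustrative contents only.
blob = build_fpkg({
    "foo4/foo4.factor": b"! vocabulary source would go here\n",
    "checksum.sha256": b"",
})
print(list_fpkg(blob))
```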
The files are organized into directories as the package repository administrator wishes.
There is still much to be said about package repository layout and partitioning, but I will cover this in a following post.
For the first implementation, the package manager can only retrieve files via HTTP, but FTP support is also planned.
The Local Factor Installation
The Factor package manager's main function is to install the requested package (from a remote package repository) into the local Factor installation so the package is available for use, automatically resolving the package's dependencies and installing any required dependency packages as well.
The package manager (used locally) has the following components:
- Index Cache
- Package Cache
- Work Directory
- Factor Files
The package manager caches each remote repository's index files in the local index cache. Upon each connection to a remote package repository, the index files must be synchronized between the local index cache and the remote repository. The index is cached locally because a repository is a relatively static entity that changes little over time. It is therefore more efficient to download only the parts of the remote index that have been updated and to scan through the index on the local machine. This operation does not usually require manual intervention and is transparent to the user.
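One way to picture the synchronization step, sketched in Python: assume the repository publishes a small manifest that maps each index file to a content hash, and the local cache remembers the hashes it last saw. The manifest, the file names, and both helper functions are assumptions for illustration; the real mechanism may differ.

```python
def files_to_fetch(remote_manifest, local_manifest):
    """Index files that are new or whose hash differs from the cached copy."""
    return sorted(name for name, digest in remote_manifest.items()
                  if local_manifest.get(name) != digest)


def files_to_drop(remote_manifest, local_manifest):
    """Cached index files that no longer exist in the remote repository."""
    return sorted(set(local_manifest) - set(remote_manifest))


remote = {"index-a.txt": "111", "index-b.txt": "222", "index-c.txt": "333"}
local = {"index-a.txt": "111", "index-b.txt": "OLD", "index-z.txt": "999"}

print(files_to_fetch(remote, local))
print(files_to_drop(remote, local))
```

Only the changed and new index files are downloaded, which is the reason for splitting the index into multiple files in the first place.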
The package manager also provides a package cache for storing .fpkg files. Caching is also for efficiency reasons. It allows the user to download-once-install-many-times, without the need to re-download the same package files each time. This makes installing and removing the same packages very easy and allows for experimentation and quick revert if something goes wrong. Caching behavior can be changed if you don't want to cache packages because of low disk space, or other reasons.
The work directory is a directory for temporarily extracting package files and working with the files in a package. It is usually empty and is only used while packages are being installed or updated, because extracted package files left lying around would just waste space.
The package manager knows about, and has access to, the local Factor installation files. After a package has been downloaded, verified, and extracted, it is moved to the local Factor vocab search path so the package is available for use in Factor immediately after installation.
Dependency Resolution
The package manager's most complex part is the package dependency resolution mechanism. This allows the package manager to automatically resolve the dependencies for the packages-to-be-installed and makes sure that any package that is installed has the necessary libraries and files available for it to work correctly.
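At its simplest, dependency resolution means producing an installation order in which every package comes after the packages it depends on. The Python sketch below does this with a depth-first traversal and rejects dependency cycles. It is a generic technique, not necessarily the algorithm the package manager will use, and the dependency table (including Foo3 depending on Foo1) is invented for the example. Version constraints are ignored entirely.

```python
def install_order(package, deps):
    """Return packages in an order where dependencies come first.

    deps maps a package name to the names it depends on.
    Raises ValueError if the dependencies form a cycle.
    """
    order, visiting, done = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"dependency cycle involving {name}")
        visiting.add(name)
        for dep in deps.get(name, ()):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)  # appended only after all its dependencies

    visit(package)
    return order


deps = {"foo4": ["foo1", "foo2", "foo3"], "foo3": ["foo1"]}
print(install_order("foo4", deps))
```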
Reliability
The package manager is designed with reliability in mind as one of the primary goals. The following mechanisms are used to ensure that packages are installed in a reliable manner:
- Package Checksums
- Package Signing
- Transactions
Each package contains a checksum.md5 or checksum.sha256 file. This file stores MD5/SHA256 hashes of the package files, which are used by the package manager to make sure that a package was downloaded successfully and that the package files are intact. This package verification process happens automatically and is transparent to the user.
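The verification step could look roughly like the following Python sketch. It assumes the checksum.sha256 file uses the same "hex digest, whitespace, file name" lines that the `sha256sum` tool prints; the post does not specify the format, so treat that and the helper names as assumptions.

```python
import hashlib


def parse_checksums(text):
    """Parse 'digest  filename' lines into {filename: digest}."""
    expected = {}
    for line in text.splitlines():
        if line.strip():
            digest, name = line.split(maxsplit=1)
            expected[name.strip()] = digest.lower()
    return expected


def verify(files, checksum_text):
    """Return the names of listed files that are missing or do not match.

    files maps a file name to its extracted content as bytes.
    """
    bad = []
    for name, digest in parse_checksums(checksum_text).items():
        data = files.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != digest:
            bad.append(name)
    return sorted(bad)


content = b"USING: kernel ;\n"
checksum_text = f"{hashlib.sha256(content).hexdigest()}  foo4/foo4.factor\n"
print(verify({"foo4/foo4.factor": content}, checksum_text))
```

An empty result means every listed file is present and intact; anything else would cause the installation to stop before the Factor installation is touched.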
Each package can be signed and checked for authenticity upon download. This mechanism will be implemented in a later phase of the package manager, but it is planned.
The package manager will also be able to deal with situations where the connection to the remote package repository is lost or a power failure occurs. Using transactions, the package manager will first record the actions that need to be performed in a file and then perform the actions as dictated by the transaction. In the event of a power failure or loss of connection, the package manager can resume or restart the interrupted transaction, and it can also detect corrupt transactions and try to restart them.
The package manager is designed to complete operations in stages with transactions. This means a transaction is further subdivided into stages, and if a stage fails, only that stage needs to be resumed or restarted. Transactions are planned for a later release of the package manager and will not be available in the first release. Even without the transaction mechanism, the package manager already uses phases to increase reliability. The package manager should never leave your Factor installation in an unstable or unknown state.
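Since transactions are not implemented yet, the following is only a sketch of the general idea, written in Python: a journal file records which stages have completed, so an interrupted run can pick up at the failed stage, and an unreadable journal causes a restart from the beginning. The stage names, the JSON journal format, and the function names are all assumptions.

```python
import json
import os

# Hypothetical stage names; the post does not list the real ones.
STAGES = ["download", "verify", "extract", "install"]


def _save(path, state):
    """Write the journal via a temp file and rename to avoid partial writes."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def _load(path, package):
    """Return the saved state, or a fresh one if the journal is missing or corrupt."""
    fresh = {"package": package, "done": []}
    try:
        with open(path) as f:
            state = json.load(f)
    except (FileNotFoundError, ValueError):
        return fresh
    if (not isinstance(state, dict) or state.get("package") != package
            or not isinstance(state.get("done"), list)):
        return fresh
    return state


def run_transaction(path, package, actions):
    """Run each stage once, skipping stages already recorded as done."""
    state = _load(path, package)
    _save(path, state)  # record the transaction before doing any work
    for stage in STAGES:
        if stage in state["done"]:
            continue  # finished before the interruption
        actions[stage](package)  # an exception here leaves the journal in place
        state["done"].append(stage)
        _save(path, state)
    os.remove(path)  # transaction complete
```

Calling `run_transaction` again with the same journal path after a failure re-runs only the stages that had not yet been recorded as done.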
Ease Of Use (A Demonstration)
The package manager's user-interface (not to be confused with GUI) should be intuitive and easy to use.
The following examples illustrate the interface for using the package manager:
First, you open the Factor UI, then make sure the packages vocabulary is loaded by typing:
USE: packages
Now we can start using the package manager words to manage our locally installed vocabularies. The following example illustrates the use of the package management words to perform some package management.
In this scenario, we want to install package Foo4. Package Foo4 depends on the following packages:
- Foo1 - which is already installed in our local Factor installation.
- Foo2 - which is cached in the local package cache, but not yet installed.
- Foo3 - which is not cached/installed locally.
To install package Foo4, you type:
"foo4" pkg-install
After issuing the pkg-install command, the package manager will see that package Foo1 is already installed, package Foo2 is cached and it should be installed, but not downloaded, and package Foo3 must be downloaded from a remote repository, and installed. Package Foo4 is also not installed/cached, so it will be downloaded as well. The package manager will then attempt to download, verify, and install the required package and all its dependency packages as well.
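The decision made for each package in this scenario can be written down as a small table-driven function. This Python sketch is illustrative only: the action labels and the `plan` function are made up, and the input order is assumed to already list dependencies before the packages that need them.

```python
def plan(order, installed, cached):
    """Decide what to do with each package, in installation order."""
    steps = []
    for name in order:
        if name in installed:
            steps.append((name, "skip"))                  # nothing to do
        elif name in cached:
            steps.append((name, "install-from-cache"))    # no download needed
        else:
            steps.append((name, "download-and-install"))
    return steps


steps = plan(
    order=["foo1", "foo2", "foo3", "foo4"],
    installed={"foo1"},
    cached={"foo2"},
)
for name, action in steps:
    print(name, action)
```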
To update an already-installed package, you type:
"foo4" pkg-update
The pkg-update word sees that we didn't specify a version and automatically searches for the latest version in all remote package repositories. If a newer version of Foo4 than the currently installed one is found, the existing Foo4 package will be updated to the newer version.
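Finding "the latest version" requires comparing version numbers as numbers, not as text, otherwise 1.10 would sort before 1.9. The Python sketch below assumes purely numeric dotted versions and represents each repository as a simple name-to-version mapping; both are assumptions, since the version scheme is not defined in this post.

```python
def version_key(version):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_available(name, repositories):
    """Newest version of a package across all repositories, or None."""
    candidates = [repo[name] for repo in repositories if name in repo]
    return max(candidates, key=version_key, default=None)


def needs_update(installed, available):
    """True if a strictly newer version than the installed one is available."""
    return available is not None and version_key(available) > version_key(installed)


repositories = [{"foo4": "1.2"}, {"foo4": "1.10"}, {"foo1": "0.9"}]
latest = newest_available("foo4", repositories)
print(latest, needs_update("1.9", latest))
```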
To remove a package, you type:
"foo4" pkg-uninstall
This will remove the Foo4 package and the user will be prompted to remove each dependency or keep the dependency packages.
This is a demonstration of how the package manager will be used by users to manage their packages. There are a lot of complexities to manage in writing a package manager and it is no easy task.
Open Issues
One issue that is still unresolved is whether, and how, the package manager and Factor are going to support installing different versions of the same package.
Release Plan
Note that this is a preliminary release plan and will most probably change.
The 1st release will only contain the basic functionality for managing a remote package repository and installing, updating, and uninstalling packages from the remote package repositories.
The 2nd release will include automatic dependency resolution.
The 3rd release will implement the transaction mechanism.
The 4th release will include package signing.
The 5th release will add FTP support.