Category: Factor
Attack Of The Clone
Hi everyone, it has been a bit crazy over here so I didn't have time to blog for a while.
This article covers the word clone: what it does, when to use it, and why it exists. clone is used very often, so it is good to know what it does.
eq?
Before we start, I want to introduce you to another Factor word called eq?. This word checks if 2 elements on the stack point to the same data (if the stack elements reference the same object). In Factor, when we say:
"123" dup eq?
the result is t, which indicates that dup makes a copy of the reference to the "123" object, so on the stack you have 2 references which point to the string "123".
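To make the difference between identity and equality concrete, here is a small sketch you can paste into the listener. It assumes a recent Factor build, and the vectors are just sample data. eq? compares references, while = compares contents.

```factor
USING: kernel prettyprint ;

! dup copies the reference, so both stack items are the same object
"123" dup eq? .                 ! prints t

! Two separate literals have equal contents but are distinct objects
V{ 1 2 3 } V{ 1 2 3 } = .       ! prints t
V{ 1 2 3 } V{ 1 2 3 } eq? .     ! prints f
```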
Data Allocation In Words
In Factor, a literal that appears inside a word definition is allocated once, when the definition is parsed, and the word returns a reference to that same object every time it is called. For example:
: foo ( -- x )
V{ } ;
allocates storage for the empty vector, and when foo is called, it will always refer to that vector instance. To test this, here is a little code example:
3 foo push ! add a new element to the vector defined in word foo
foo . ! print the contents of the vector in foo
The above code prints V{ 3 }. So why is this important? Because if you don't know about the effects of defining data inside words, it can lead to some bugs that are hard to solve.
Introducing clone
OK, now how do I get a new instance of V{ } every time I call foo? We use the clone word. So we can change our foo word as follows:
: foo ( -- x )
V{ } clone ;
Now if we execute the following code again:
3 foo push ! add a new element to the copy returned by foo
foo . ! print a fresh copy of the vector in foo
it prints V{ } instead of V{ 3 }, because clone ensures that we always get a new, fresh copy of the vector defined in foo. clone creates a new instance of an object only if the object being cloned is mutable (its value can be changed). If the object being cloned is immutable (its value cannot be changed), clone simply returns the same object.
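Here is a short sketch that puts both versions side by side. The names shared-foo and fresh-foo are made up for this example, and it assumes a recent Factor build.

```factor
USING: kernel prettyprint sequences ;
IN: scratchpad

! Hypothetical names, used only for this comparison
: shared-foo ( -- x ) V{ } ;        ! returns the literal itself
: fresh-foo ( -- x ) V{ } clone ;   ! returns a new copy on every call

shared-foo shared-foo eq? .   ! prints t, same object both times
fresh-foo fresh-foo eq? .     ! prints f, two different copies

3 fresh-foo push              ! the 3 goes into a copy that is then dropped
fresh-foo .                   ! prints V{ }

! Cloning an immutable value such as a small integer gives back the same value
42 dup clone eq? .            ! prints t
```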
Factor Fun (part 2)
This article relates to the Factor Fun (part 2) post. The code in it suffers from not using clone in the right places. I have also changed the code a bit because there were some improvements that could be made (thanks slava, erg). So the updated code looks like this:
USING: new-slots accessors math io prettyprint combinators.cleave ;
! we are using NEW-SLOTS for this program because they will be replacing the old-style
! slots
! create a tuple to hold information for each room
TUPLE: room name start-num end-num files ;
C: <room> room
: show-room-info ( tuple -- )
[ files>> ] [ name>> ] bi
"Files for room [" swap "]:" 3append print
[ . ] each ;
: sort-files ( seq -- )
! seq is a sequence of file numbers that need to be put back in their place.
! This word then calculates what files go to what room and displays the output.
! ---
V{ } clone
"A" 1 20000 V{ } clone <room> add
"B" 20001 40000 V{ } clone <room> add
"C" 40001 70000 V{ } clone <room> add
[
over
[
dup pick [ start-num>> ] [ end-num>> ] bi between?
[ over files>> push ] [ drop ] if
] each
show-room-info
] each drop ;
Notice the clone words that were added. Also I used cleave combinators to remove some stack shuffling and clean up the code a bit. This code could still be made clearer by using curry, but I'm still figuring that out.
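As a rough idea of what a curry-based version might look like, here is a sketch written against a recent Factor build (so it uses today's vocabulary names rather than new-slots). The helper words in-room?, file-room and sort-into-rooms are my own names for this example, not part of the code above.

```factor
USING: accessors arrays kernel math.order prettyprint sequences ;
IN: scratchpad

TUPLE: room name start-num end-num files ;

! Each room gets its own fresh vector, thanks to clone
: <room> ( name start-num end-num -- room )
    V{ } clone room boa ;

: in-room? ( n room -- ? )
    [ start-num>> ] [ end-num>> ] bi between? ;

! curry bakes the file number into the quotation passed to find
: file-room ( rooms n -- room/f )
    [ swap in-room? ] curry find nip ;

: sort-into-rooms ( rooms files -- )
    [ 2dup file-room [ files>> push ] [ drop ] if* ] each drop ;

"A" 1 20000 <room> "B" 20001 40000 <room> 2array
dup { 15 25000 19999 } sort-into-rooms
[ files>> . ] each   ! prints V{ 15 19999 } then V{ 25000 }
```

The stack shuffling inside the loop mostly disappears because the range test lives in its own word and curry carries the file number along.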
For more info, check out the Factor documentation for Vectors and the clone word.
Replacing Characters In A String (Updated)
While working on the Factor package manager, I came across an interesting problem:
"how to replace certain characters with other characters in a string".
I have been looking at a few solutions. The first solution that I came up with was:
: url>path ( URL -- path )
"" swap
[
1string
{
{ ":" [ "-" ] }
{ "/" [ "-" ] }
{ "%" [ "-" ] }
{ "~" [ "-" ] }
{ "@" [ "-" ] }
[ ]
} case append
] each ;
After some chatting on IRC (thanks tizoc) I changed the solution to:
: url>path ( URL -- path )
[
{
{ CHAR: : [ CHAR: - ] }
{ CHAR: / [ CHAR: - ] }
{ CHAR: % [ CHAR: - ] }
{ CHAR: ~ [ CHAR: - ] }
{ CHAR: @ [ CHAR: - ] }
[ ]
} case
] map ;
In both solutions, I am transforming specific characters to a single character, but this need not be the case. The solution I'm looking for will need to support transforming to multiple characters if needed.
To use the above word you would type:
"abc:/%" url>path
=> "abc---"
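For the multi-character case mentioned above, one possible approach is a lookup table from characters to replacement strings. This is only a sketch for a recent Factor build. The table contents (including the "_pct_" replacement) and the word names replace-char and url>path* are invented for illustration.

```factor
USING: assocs kernel prettyprint sequences strings ;
IN: scratchpad

! Hypothetical table: key is a character, value is the string that replaces it
CONSTANT: replacements H{
    { CHAR: : "-" }
    { CHAR: / "-" }
    { CHAR: % "_pct_" }
}

! Look the character up; fall back to the character itself as a string
: replace-char ( ch -- str )
    dup replacements at [ nip ] [ 1string ] if* ;

! Turn every character into a string, then join the pieces
: url>path* ( url -- path )
    [ replace-char ] { } map-as "" concat-as ;

"abc:/%" url>path* .   ! prints "abc--_pct_"
```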
After talking to erg on #concatenative, I changed the code to use the replace word in the sequences.lib vocabulary:
: url>path ( URL -- path )
":/%~@" "-----" replace ;
An Overview Of Package Management
I am busy working on a package manager for Factor. This post is about the general process of package management.
There are two main parts to package management:
- The local Factor installation.
- The remote package repository.
The following schematic depicts the parts of the Factor package management system:
+--------------------------------------------------+
|            Remote Package Repository             |
|                                                  |
| +------------+ +-------------------------------+ |
| | Repository | |         Package Files         | |
| |   Index    | |                               | |
| +------------+ +-------------------------------+ |
|                                                  |
+--------------------------------------------------+
                         |
                         |
+--------------------------------------------------+
|            Local Factor Installation             |
|                                                  |
| +-------+ +---------+      +------------------+  |
| | index | | package |      |   Factor files   |  |
| | cache | |  cache  |      |                  |  |
| +-------+ +---------+      |                  |  |
|                            |                  |  |
| +----------+               |                  |  |
| | work dir | ------------> |                  |  |
| +----------+               +------------------+  |
|                                                  |
+--------------------------------------------------+
The Remote Package Repository
The remote package repo has 2 main components:
- the repository index
- the package files (tarballs)
The repository index is a collection of files that store information about each package in the repository, such as the name, version, and dependencies of each package, and the physical location of the package file.
The repository index is split into multiple files to make synchronizing the indexes more efficient. The package manager uses the repository index to obtain information about the packages-to-be-installed and to download the necessary packages. Thus, the package manager's view of the remote repository is always what it sees in the index files. This means that a package repository's index must be updated as new packages are added to the repository. The Factor package manager will support words for creating and maintaining a package repository, so many of these tasks will be automated.
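To give a feel for what one index record could hold, here is a minimal sketch for a recent Factor build. The tuple index-entry, the word find-entry, and the sample versions and paths are all assumptions made for illustration. The real index format is not decided here.

```factor
USING: accessors arrays kernel prettyprint sequences ;
IN: scratchpad

! Hypothetical record for one package in the repository index
TUPLE: index-entry name version dependencies location ;

C: <index-entry> index-entry

! Return the first entry with the given name, or f if there is none
: find-entry ( name entries -- entry/f )
    [ name>> = ] with find nip ;

"foo1" "1.0" { } "libs/foo1-1.0.fpkg" <index-entry>
"foo2" "0.3" { "foo1" } "libs/foo2-0.3.fpkg" <index-entry>
2array
"foo2" swap find-entry location>> .   ! prints "libs/foo2-0.3.fpkg"
```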
The other component of the package repository is the actual package files. These are normal .tar.bz2 files, but they have the extension .fpkg. A Factor package has a specific layout and requires certain files to be present, so I decided to change the file extension to .fpkg to make it clear that it is not just a plain .tar.bz2 tarball.
The files are organized into directories as the package repository administrator wishes.
There is still much to be said about package repository layout and partitioning, but I will cover this in a following post.
For the first implementation, the package manager can only retrieve files via HTTP, but FTP support is also planned.
The Local Factor Installation
The Factor package manager's main function is to install the requested package (from a remote package repository) into the local Factor installation so the package is available for use (automatically sorting out the package dependencies and installing all necessary dependency packages as well).
The package manager (used locally) has the following components:
- Index Cache
- Package Cache
- Work Directory
- Factor Files
The package manager caches each remote repository's index files in the local index cache. Upon each connection to a remote package repository, the package index files must be synchronized between the local index cache and the remote repo. The remote package repository's index is cached locally because a repo is a relatively static entity with a low volume of changes over time. Thus it is more efficient to download only the parts of the remote package repo's index that have been updated and to scan through the index on the local machine. This operation does not usually require manual intervention and is usually transparent to the user.
The package manager also provides a package cache for storing .fpkg files. Caching is also for efficiency reasons. It allows the user to download-once-install-many-times, without the need to re-download the same package files each time. This makes installing and removing the same packages very easy and allows for experimentation and quick revert if something goes wrong. Caching behavior can be changed if you don't want to cache packages because of low disk space, or other reasons.
The work directory is a directory for temporarily extracting package files and working with the files in a package. This directory is usually empty and only used when you are installing / updating packages. We don't want extracted package files lying around, since they just waste space.
The package manager knows about, and has access to, the local Factor installation files. After a package has been downloaded, verified, and extracted, it is moved to the local Factor vocab search path so the package is available for use in Factor immediately after installation.
Dependency Resolution
The package manager's most complex part is the package dependency resolution mechanism. This allows the package manager to automatically resolve the dependencies for the packages-to-be-installed and makes sure that any package that is installed has the necessary libraries and files available for it to work correctly.
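One simple way to think about dependency resolution is a depth-first walk that lists every dependency before the package that needs it. The sketch below is not the package manager's actual resolver. It assumes a recent Factor build, a made-up in-memory dependency table, and no dependency cycles (a cycle would make it recurse forever).

```factor
USING: assocs kernel prettyprint sequences ;
IN: scratchpad

! Hypothetical dependency table: package name -> names it depends on
CONSTANT: deps H{
    { "foo4" { "foo1" "foo2" "foo3" } }
    { "foo3" { "foo1" } }
    { "foo2" { } }
    { "foo1" { } }
}

! Visit the dependencies of name first, then append name itself
: (install-order) ( order name -- order )
    2dup swap member? [ drop ] [
        [ deps at [ (install-order) ] each ] [ suffix! ] bi
    ] if ;

! Start from a fresh vector on every call (note the clone)
: install-order ( name -- order )
    V{ } clone swap (install-order) ;

"foo4" install-order .   ! prints V{ "foo1" "foo2" "foo3" "foo4" }
```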
Reliability
The package manager is designed with reliability in mind as one of the primary goals. The following mechanisms are used to ensure that packages are installed in a reliable manner:
- Package Checksums
- Package Signing
- Transactions
Each package contains a checksum.md5 or checksum.sha256 file. This file stores MD5/SHA256 hashes of the package files, which are used by the package manager to make sure that a package was downloaded successfully and that the package files are intact. This package verification process happens automatically and is transparent to the user.
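The check itself boils down to hashing the bytes and comparing against the stored value. Here is a minimal sketch, assuming the checksums, checksums.md5 and math.parser vocabularies as found in recent Factor releases. The word names md5-hex and intact? are my own, and the input is a short in-memory byte array rather than a real package file.

```factor
USING: byte-arrays checksums checksums.md5 kernel math.parser
prettyprint ;
IN: scratchpad

! Hash a byte array with MD5 and render the digest as a hex string
: md5-hex ( bytes -- str )
    md5 checksum-bytes bytes>hex-string ;

! Compare the computed hash against the expected one
: intact? ( bytes expected-hex -- ? )
    [ md5-hex ] dip = ;

"hello" >byte-array "5d41402abc4b2a76b9719d911017c592" intact? .
```

For a file on disk, the same vocabulary offers checksum-file, which takes a path instead of a byte array.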
Each package can be signed and checked for authenticity upon download. This mechanism will be implemented in a later phase of the package manager, but it is planned.
The package manager will also be able to deal with situations such as a lost connection to the remote package repository or a power failure. Using transactions, the package manager will first record the actions that need to be performed in a file and then perform the actions as dictated by the transaction. In the event of a power failure or loss of connection, the package manager can resume / restart the interrupted transaction and also detect corrupt transactions and try to restart them.
The package manager is designed to complete operations in stages with transactions. This means a transaction is further subdivided into stages, and if a stage fails, only that stage needs to be resumed / restarted. Transactions are also planned for a later release of the package manager and will not be available in the first release. Even without the transaction mechanism, the package manager already uses phases to increase reliability. The package manager should never leave your Factor installation in an unstable / unknown state.
Ease Of Use (A Demonstration)
The package manager's user-interface (not to be confused with GUI) should be intuitive and easy to use.
The following examples illustrate the interface for using the package manager:
First, you open the Factor UI, then make sure the packages vocabulary is loaded by typing:
USE: packages
Now we can start using the package manager words to manage our locally installed vocabularies. The following example shows these words in action.
In this scenario, we want to install package Foo4. Package Foo4 depends on the following packages:
- Foo1 - which is already installed in our local Factor installation.
- Foo2 - which is cached in the local package cache, but not yet installed.
- Foo3 - which is not cached/installed locally.
To install package Foo4, you type:
"foo4" pkg-install
After issuing the pkg-install command, the package manager will see that package Foo1 is already installed, that package Foo2 is cached and so only needs to be installed, not downloaded, and that package Foo3 must be downloaded from a remote repository and then installed. Package Foo4 is also neither installed nor cached, so it will be downloaded as well. The package manager will then attempt to download, verify, and install the required package and all of its dependency packages.
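The decision made for each package can be written down as a tiny word. This is only a sketch of the logic described above, assuming a recent Factor build. The word pkg-action and the action strings are invented for this example and are not part of the packages vocabulary.

```factor
USING: kernel prettyprint ;
IN: scratchpad

! Decide what to do with one package from its installed/cached flags
: pkg-action ( installed? cached? -- action )
    swap
    [ drop "skip" ]
    [ "install from cache" "download and install" ? ] if ;

t f pkg-action .   ! Foo1: prints "skip"
f t pkg-action .   ! Foo2: prints "install from cache"
f f pkg-action .   ! Foo3 and Foo4: prints "download and install"
```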
To update an already-installed package, you type:
"foo4" pkg-update
The pkg-update word sees that we didn't specify a version and automatically searches for the latest version in all remote package repositories. If a newer version of Foo4 is found than the currently installed one, the existing Foo4 package will be updated to the newer version.
To remove a package, you type:
"foo4" pkg-uninstall
This will remove the Foo4 package and the user will be prompted to remove each dependency or keep the dependency packages.
This is a demonstration of how the package manager will be used by users to manage their packages. There are a lot of complexities to manage in writing a package manager and it is no easy task.
Open Issues
One issue that is still unresolved is whether and how the package manager and Factor are going to support installation of different versions of the same package.
Release Plan
Note that this is a preliminary release plan and will most probably change.
The 1st release will only contain the basic functionality for managing a remote package repository and installing, updating, and uninstalling packages from the remote package repositories.
The 2nd release will include automatic dependency resolution.
The 3rd release will implement the transaction mechanism.
The 4th release will include package signing.
The 5th release will add FTP support.
Factor Package Manager Preview
Work has been keeping me quite busy lately, but I use my spare time to write a package manager for Factor. This post provides an extremely early preview of the code so far. The following code is just for the pkg-install command. Currently implemented is a really bare-bones pkg-install command that will be used to install a package. Any comments / suggestions are welcome (as always).
USING: kernel http.client system sequences splitting io io.files io.launcher
crypto.md5 ;
IN: packages
<PRIVATE
: factor-base-dir ( -- path )
vm "/" last-split1 drop ;
: base-dir ( -- path )
! push the full path of the package/ dir onto the stack
factor-base-dir "/extra/packages" append ;
: pkg-download ( url -- ? )
! download a package from a remote repo (using the specified URL) to the
! local package cache
dup download-name base-dir "/data/pkg-cache/" rot 3append
download-to
t ;
: pkg-in-cache? ( name -- ? )
! check if a package with the specified name exists in the local package
! cache
base-dir "/data/pkg-cache" append directory
[ first over ".fpkg" append = ] contains? nip ;
: registry-to-lines ( -- seq )
! converts the pkg-registry file to a sequence of lines so it can be
! read / parsed
base-dir "/data/pkg-registry" append <file-reader> lines ;
: pkg-installed? ( name -- ? )
! check if a package is installed by reading the pkg-registry file
registry-to-lines [ over = ] contains? nip ;
: pkg-extract ( name -- ? )
! extracts the files from the package tarball into the data/tmp/ dir
cwd >r
"tar -xjf " base-dir "/data/pkg-cache/" 3append swap ".fpkg" 3append
base-dir "/data/tmp" append cd
run-process drop r> cd
t ;
: (pkg-valid?) ( path -- ? )
cd cwd directory
[
dup second
[ first (pkg-valid?) ".." cd ]
[ first cwd "/" rot 3append dup file>md5str " :: " rot 3append print ] if
] each ;
: pkg-valid? ( name -- ? )
! verify the integrity of the package and return the result
cwd swap base-dir "/data/tmp/" rot 3append (pkg-valid?) cd t ;
: (pkg-install) ( name -- )
! moves the extracted package files to the extra/ dir in the local Factor
! installation
dup base-dir "/data/tmp/" rot 3append dup
rot factor-base-dir "/extra/" rot 3append
copy-tree delete-tree ;
PRIVATE>
: pkg-install ( name -- )
dup pkg-installed? [
">> The package is already installed.\n" write
] [
! sync the local & remote index for the repo
! check the cached repo index for the latest version of the package
! check if that version is cached in the local package cache
! check dependencies of the package to install
! Prompt user if he would like to continue installation
"Would you like to continue the install? [y/n]: " write readln
"n" = [
">> Nothing was installed" print
] [
! download the required packages from the remote repo
">> downloading required packages" print
dup "http://zenhackers.net/files/" swap append ".fpkg" append pkg-download
drop ! drop result of pkg-download word, we should use this result to test for errors
! extract files from packages into their separate directories
">> extracting packages to temporary directory" print
dup pkg-extract drop
! verify integrity of downloaded packages using checksums
">> verifying integrity of packages" print
dup pkg-valid? [
! install the package in the extra/ directory
">> installing packages" print
(pkg-install)
">> install completed" print
] [
">> The package integrity check FAILED !!!" print
] if
] if
] if ;
Factor Package Manager
In my previous blog post I wrote code that implemented an algorithm. Not very useful for other people, but it was a starting point for me. So now I want to write something more useful that can actually contribute to Factor.
After a short chat on #concatenative, it seems there is no package manager for Factor, something along the lines of RubyGems.
So for my new Factor project, I will be writing a package manager for Factor. It should be a good learning experience and it should be useful.
I have started designing the package manager and have put up some ideas on the zen/-/acker WIKI.
Please take a look and leave comments on my blog or email me (rvdalen@yahoo.co.uk) about ideas / changes.