Attack Of The Clone
Hi everyone, it has been a bit crazy over here so I didn't have time to blog for a while.
This article covers the clone word: what it does, why it exists, and where to use it. clone is used very often, so it is good to know exactly what it does.
eq?
Before we start, I want to introduce you to another Factor word called eq?. This word checks if 2 elements on the stack point to the same data (if the stack elements reference the same object). In Factor, when we say:
"123" dup eq?
the result is t, which indicates that dup makes a copy of the reference to the "123" object, so on the stack you have 2 references which point to the string "123".
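To make the difference between identity and equality concrete, here is a small listener session. It is only a sketch, and it borrows the clone word that the rest of this article is about: eq? compares references, while = compares contents.

```factor
USING: kernel prettyprint ;

"123" dup eq? .          ! t: both stack items reference the same string
"123" dup clone eq? .    ! f: clone produced a second string object
"123" dup clone = .      ! t: the two objects still have equal contents
```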
Data Allocation In Words
In Factor, a literal written inside a word definition is allocated once, when the definition is parsed, and the word returns a reference to that same object every time it is called. For example:
: foo ( -- x )
    V{ } ;
allocates storage for the empty vector, and when foo is called, it will always refer to that vector instance. To test this, here is a little code example:
3 foo push ! add a new element to the vector defined in word foo
foo . ! print the contents of the vector in foo
The above code prints V{ 3 }. So why is this important? Because if you don't know about the effects of defining data inside words, it can lead to some bugs that are hard to solve.
Introducing clone
Ok, now how do I get a new instance of V{ } every time I call foo? We use the clone word. So we can change our foo word as follows:
: foo ( -- x )
    V{ } clone ;
now if we execute the following code again:
3 foo push ! add a new element to the vector defined in word foo
foo . ! print the contents of the vector in foo
it prints V{ } instead of V{ 3 }, because clone ensures that we always get a new, fresh copy of the vector defined in foo. clone creates a new instance of an object only if the object being cloned is a mutable object (its value can be changed). If the object being cloned is immutable (its value cannot be changed) then clone does nothing.
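A quick way to see the difference is to call each version of the word twice and compare the results with eq?. The names shared-vec and fresh-vec below are made up for this sketch (they are the two versions of foo side by side), and the code is meant to be pasted into the listener.

```factor
USING: kernel prettyprint ;

! hypothetical name: returns the single literal vector stored in the word
: shared-vec ( -- v ) V{ } ;

! hypothetical name: returns a new copy of the literal on every call
: fresh-vec ( -- v ) V{ } clone ;

shared-vec shared-vec eq? .    ! t: both calls return the same object
fresh-vec fresh-vec eq? .      ! f: each call returns its own vector
```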
Factor Fun (part 2)
This article relates to the Factor Fun (part 2) post. The code in it suffers from not using clone in the right places. I have also changed the code a bit because there were some improvements that could be made (thanks slava, erg). So the updated code looks like this:
USING: new-slots accessors math io prettyprint combinators.cleave ;
! we are using NEW-SLOTS for this program because they will be replacing the old-style
! slots
! create a tuple to hold information for each room
TUPLE: room name start-num end-num files ;
C: <room> room
: show-room-info ( tuple -- )
    [ files>> ] [ name>> ] bi
    "Files for room [" swap "]:" 3append print
    [ . ] each ;

: sort-files ( seq -- )
    ! seq is a sequence of file numbers that need to be put back in their place.
    ! This word then calculates what files go to what room and displays the output.
    ! ---
    V{ } clone
    "A" 1 20000 V{ } clone <room> add
    "B" 20001 40000 V{ } clone <room> add
    "C" 40001 70000 V{ } clone <room> add
    [
        over
        [
            dup pick [ start-num>> ] [ end-num>> ] bi between?
            [ over files>> push ] [ drop ] if
        ] each
        show-room-info
    ] each drop ;
Notice the clone words that were added. I also used cleave combinators to remove some stack shuffling and clean up the code a bit. This code could still be made clearer by using curry, but I'm still figuring that out.
For more info, check out the Factor documentation for Vectors and the clone word.
Factor Fun (Part 2)
It's time to have some fun with Factor and learn something in the process. The purpose of this post is to demonstrate how to use tuples in Factor.
The Problem
My girlfriend works for a doctor as her personal assistant. (Yes, I have a REAL girlfriend.) The doctor has 3 rooms filled with files. Each file has a unique number. Each room contains a specific sequence of file numbers:
- Room A : 1 - 20 000
- Room B : 20 001 - 40 000
- Room C : 40 001 - 70 000
Now, when the doctor sees a new patient, my g/f must physically fetch that patient's file from either room A, B, or C. But because she has a lot of other stuff to do, there is no time to put the file back (and of course the doctor will not do it). So at the end of each week there is a pile of files that need to be put back in their respective rooms.
This isn't really a problem for her, but I thought it might make a good article that is simple to write and understand, and that explains a few concepts of Factor to the readers.
Ok, so the program we are going to write takes as input a sequence of file numbers that need to be put away, and sorts those files into a pile for each room.
The 1st Solution
USING: new-slots accessors math io prettyprint ;
! we are using NEW-SLOTS for this program because they will be replacing the old-style
! slots
! create a tuple to hold information for each room
TUPLE: room name start-num end-num files ;
C: <room> room
: show-room-info ( tuple -- )
    dup files>> swap name>>
    "Files for room [" swap "]:" 3append print
    [ number>string print ] each ;

: sort-files ( seq -- )
    ! seq is a sequence of file numbers that need to be put back in their place.
    ! This word then calculates what files go to what room and displays the output.
    ! ---
    V{ }
    "A" 1 20000 V{ } <room> add
    "B" 20001 40000 V{ } <room> add
    "C" 40001 70000 V{ } <room> add
    [
        over
        [
            dup pick dup start-num>> swap end-num>> between?
            [ over files>> swap add >>files ] [ drop ] if
        ] each
        show-room-info
    ] each drop ;
In this solution things are a bit hard-coded and the abstraction is incomplete. We need a way to define the "room" sequence and the files sequence and let the sort-files word do the rest. This would also allow us to specify any number of rooms that we might need. The idea behind this post, however, is to show how we use tuples in Factor, not to show how to write complete abstractions.
More About Tuples
To work with tuples, there are usually 2 steps:
- define the new tuple type
- create a constructor word for the tuple
To define a tuple, we write:
TUPLE: room name start-num end-num files ;
This defines a new tuple class called room with the following slots:
- name
- start-num
- end-num
- files
To create an instance of the tuple class, we need to create a constructor word. Constructor words are used to initialize the slots of a tuple instance to known values. There are 2 general ways to do this:
- we can construct a new tuple with corresponding arguments on the stack.
- we can construct a new tuple with fixed slot values.
To create a constructor word that will take its initialization values from the stack, we write:
C: <room> room
or we can write:
: <room> room construct-boa ;
Both versions do exactly the same thing: they create a constructor word that initializes the tuple instance's slot values with the corresponding values on the stack. The first version is shorter and cleaner if this is all that needs to be done when constructing a tuple instance. This type of constructor word is usually referred to as a "boa-constructor". (boa stands for "by order of arguments". This means that the slots are initialized in the order in which the arguments were pushed onto the stack: the first slot gets the first value pushed, the second slot gets the second value pushed, and so on, so the last slot gets the value on top of the stack.)
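To check the slot order of a boa-constructor, you can build a room and read a few slots back. This is a small sketch for the listener; it assumes a recent Factor release, where the accessors vocabulary alone provides the slot words, and it uses the tri combinator, which may not exist in the version this post was written against.

```factor
USING: accessors kernel prettyprint ;

TUPLE: room name start-num end-num files ;
C: <room> room

! "A" is pushed first, so it lands in the first slot (name);
! the vector is on top of the stack, so it lands in the last slot (files).
"A" 1 20000 V{ } clone <room>
[ name>> . ] [ start-num>> . ] [ end-num>> . ] tri
! prints "A", then 1, then 20000
```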
To create a constructor word that will initialize the tuple instance's slot values to fixed / pre-defined values we can create a constructor word like this:
: <room> room construct-empty "" >>name 0 >>start-num 0 >>end-num V{ } >>files ;
This constructor word creates a new room tuple instance and initializes the new instance's slots to known values. Using this method of construction, you can construct a tuple in any way imaginable (well, almost).
Constructor words have the following naming convention:
<constructor-word-name>
and should always be named like this if possible.
To get or set the value of a tuple slot, we use the new-slots vocab, which gives us access to the new way of working with tuple slots. To work with tuple slots, we type:
! create a new instance of the room tuple (using the boa-constructor):
"room A" 1 20000 V{ } <room>
! setting the 'name' tuple-slot's value:
"room B" >>name
! getting the value of the 'name' tuple-slot and printing it:
dup name>> print
Some Interesting Questions
- What is the naming convention if you want 2 or more constructors for the same tuple? Maybe something like <room-foo> and <room-bar>. Any ideas are welcome.
- Would it not be a good idea for the slot accessors to have a version that automatically duplicates the tuple and then retrieves the slot value?
Replacing Characters In A String (Updated)
While working on the Factor package manager, I came across an interesting problem:
"how to replace certain characters with other characters in a string".
I have been looking at a few solutions. The first solution that I came up with was:
: url>path ( URL -- path )
    "" swap
    [
        1string
        {
            { ":" [ "-" ] }
            { "/" [ "-" ] }
            { "%" [ "-" ] }
            { "~" [ "-" ] }
            { "@" [ "-" ] }
            [ ]
        } case append
    ] each ;
After some chatting on IRC (thanks tizoc) I changed the solution to:
: url>path ( URL -- path )
    [
        {
            { CHAR: : [ CHAR: - ] }
            { CHAR: / [ CHAR: - ] }
            { CHAR: % [ CHAR: - ] }
            { CHAR: ~ [ CHAR: - ] }
            { CHAR: @ [ CHAR: - ] }
            [ ]
        } case
    ] map ;
In both solutions, I am transforming specific characters to 1 character, but this need not be the case. The solution I'm looking for will need to support transforming a character into multiple characters if needed.
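One possible way to get multi-character replacements is to map every character to a string and then join the pieces. The following is only a sketch: the word name escape-url and the "--" replacement for "/" are made up for illustration, and it is written against a recent Factor release (map-as and concat may be named differently in older versions).

```factor
USING: combinators kernel prettyprint sequences strings ;

! hypothetical word: every character becomes a string, so a
! replacement may be longer than one character
: escape-url ( url -- path )
    [
        {
            { CHAR: : [ "-" ] }
            { CHAR: / [ "--" ] }    ! made-up two-character replacement
            [ 1string ]             ! any other character is kept as-is
        } case
    ] { } map-as concat ;

"a:/b" escape-url .    ! "a---b"
```

The cost compared to the map version is an intermediate array of small strings, which should not matter for something as short as a URL.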
To use the url>path word you would type:
"abc:/%" url>path
=> "abc---"
After talking to erg on #concatenative, I changed the code to use the replace word in the sequences.lib vocabulary:
: url>path ( URL -- path )
    ":/%~@" "-----" replace ;
An Overview Of Package Management
I am busy working on a package manager for Factor. This post is about the general process of package management.
There are 2 main parts to package management:
- The local Factor installation.
- The remote package repository.
The following schematic depicts the parts of the Factor package management system:
+--------------------------------------------------+
|            Remote Package Repository             |
|                                                  |
|  +------------+  +-----------------------------+ |
|  | Repository |  |        Package Files        | |
|  |   Index    |  |                             | |
|  +------------+  +-----------------------------+ |
|                                                  |
+--------------------------------------------------+
                         |
                         |
+--------------------------------------------------+
|            Local Factor Installation             |
|                                                  |
|  +-------+   +---------+   +------------------+  |
|  | index |   | package |   |   Factor files   |  |
|  | cache |   |  cache  |   |                  |  |
|  +-------+   +---------+   |                  |  |
|                            |                  |  |
|  +----------+              |                  |  |
|  | work dir |     -->      |                  |  |
|  +----------+              +------------------+  |
|                                                  |
+--------------------------------------------------+
The Remote Package Repository
The remote package repo has 2 main components:
- the repository index
- the package files (tarballs)
The repository index is a collection of files that store information about each package in the repository, such as the name, version, and dependencies of each package, and the physical location of the package file.
The repository index is split into multiple files to make synchronizing the indexes more efficient. The package manager uses the repository index to obtain information about the packages-to-be-installed and to download the necessary packages. Thus, the package manager's view of the remote repository is always what it sees in the index files. This means that a package repository's index must be updated as new packages are added to the repository. The Factor package manager will support words for creating and maintaining a package repository, so many of these tasks will be automated.
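To give an idea of the information an index entry has to carry, here is how one entry could be modelled as a tuple. This is purely illustrative: the tuple name, the slot names, the version string, and the location path are all invented for this sketch, and the real index file format is not decided here. It assumes a recent Factor release with the accessors vocabulary.

```factor
USING: accessors kernel prettyprint ;

! Hypothetical shape of one index entry; all names and values are made up.
TUPLE: index-entry name version deps location ;
C: <index-entry> index-entry

"foo4" "1.0" { "foo1" "foo2" "foo3" } "foo/foo4-1.0.fpkg" <index-entry>
[ name>> . ] [ deps>> . ] bi
! prints "foo4", then { "foo1" "foo2" "foo3" }
```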
The other component of the package repository is the actual package files. These are normal .tar.bz2 files, but they have the extension .fpkg. A Factor package has a specific layout and requires certain files to be present, so I decided to change the file extension to .fpkg to make it clear that it is not just a plain .tar.bz2 tarball.
The files are organized into directories as the package repository administrator wishes.
There is still much to be said about package repository layout and partitioning, but I will cover this in a following post.
For the first implementation, the package manager can only retrieve files via HTTP, but FTP support is also planned.
The Local Factor Installation
The Factor package manager's main function is to install the requested package (from a remote package repository) into the local Factor installation so the package is available for use (automatically sorting out the package dependencies and installing all necessary dependency packages as well).
The package manager (used locally) has the following components:
- Index Cache
- Package Cache
- Work Directory
- Factor Files
The package manager caches each remote repository's index files into the local index cache. Upon each connection to a remote package repository, the package index files must be synchronized between the local index cache and the remote repo. The remote package repository's index is cached locally because a repo is a relatively static entity with low volume changes over time. Thus it is more efficient to download parts of the remote package repo's index that have been updated and scan through the index on the local machine. This operation does not usually require manual user intervention and is usually transparent to the user.
The package manager also provides a package cache for storing .fpkg files. Caching is also for efficiency reasons. It allows the user to download-once-install-many-times, without the need to re-download the same package files each time. This makes installing and removing the same packages very easy and allows for experimentation and quick revert if something goes wrong. Caching behavior can be changed if you don't want to cache packages because of low disk space, or other reasons.
The work directory is a directory for temporarily extracting package files and working with the files in a package. This directory is usually empty and only used when you are installing / updating packages. We don't want any extracted package files lying around that just waste space.
The package manager knows about, and has access to, the local Factor installation files. After a package has been downloaded, verified, and extracted, it is moved to the local Factor vocab search path so the package is available for use in Factor immediately after installation.
Dependency Resolution
The package manager's most complex part is the package dependency resolution mechanism. This allows the package manager to automatically resolve the dependencies for the packages-to-be-installed and makes sure that any package that is installed has the necessary libraries and files available for it to work correctly.
Reliability
The package manager is designed with reliability in mind as 1 of the primary goals. The following mechanisms are used to ensure that packages are installed in a reliable manner:
- Package Checksums
- Package Signing
- Transactions
Each package contains a checksum.md5 or checksum.sha256 file. This file stores MD5/SHA256 hashes of the package files, which the package manager uses to make sure that a package was downloaded successfully and that the package files are intact. This package verification process happens automatically and is transparent to the user.
Each package can be signed and checked for authenticity upon download. This mechanism will be implemented in a later phase of the package manager, but it is planned.
The package manager will also be able to deal with situations such as a lost connection to the remote package repository or a power failure. Using transactions, the package manager will first record the actions that need to be performed in a file and then perform the actions as dictated by the transaction. In the event of a power failure or loss of connection, the package manager can resume / restart the interrupted transaction and also detect corrupt transactions and try to restart them.
The package manager is designed to complete operations in stages with transactions. This means a transaction is further sub-divided into stages, and if a stage fails, only that stage needs to be resumed / restarted. Transactions are also planned for a later release of the package manager and will not be available in the first release. Even without the transaction mechanism, the package manager already uses phases to increase reliability. The package manager should never leave your Factor installation in an unstable / unknown state.
Ease Of Use (A Demonstration)
The package manager's user-interface (not to be confused with GUI) should be intuitive and easy to use.
The following examples illustrate the interface for using the package manager:
First, you open the Factor UI, then make sure the packages vocabulary is loaded by typing:
USE: packages
Now we can start using the package manager words to manage our locally installed vocabularies. The following example illustrates the use of the package management words to perform some package management.
In this scenario, we want to install package Foo4. Package Foo4 depends on the following packages:
- Foo1 - which is already installed in our local Factor installation.
- Foo2 - which is cached in the local package cache, but not yet installed.
- Foo3 - which is not cached/installed locally.
To install package Foo4, you type:
"foo4" pkg-install
After issuing the pkg-install command, the package manager will see that package Foo1 is already installed, package Foo2 is cached and it should be installed, but not downloaded, and package Foo3 must be downloaded from a remote repository, and installed. Package Foo4 is also not installed/cached, so it will be downloaded as well. The package manager will then attempt to download, verify, and install the required package and all its dependency packages as well.
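The decision made for each package in this scenario can be sketched as a three-way check. The code below is not part of the package manager: the installed and cached constants and the plan-for word are invented for this illustration, and real code would read the registry and the package cache instead of using hard-coded lists. It assumes a recent Factor release (CONSTANT: does not exist in older versions).

```factor
USING: combinators kernel prettyprint sequences ;

! Stand-ins for the local state in the scenario above (made up).
CONSTANT: installed { "foo1" }
CONSTANT: cached { "foo2" }

! hypothetical word: decide what has to happen for one package
: plan-for ( name -- action )
    {
        { [ dup installed member? ] [ drop "already installed" ] }
        { [ dup cached member? ] [ drop "install from cache" ] }
        [ drop "download and install" ]
    } cond ;

{ "foo1" "foo2" "foo3" "foo4" } [ plan-for ] map .
```

This prints an array with one action per package: foo1 is already installed, foo2 is installed from the cache, and foo3 and foo4 are downloaded and installed.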
To update an already-installed package, you type:
"foo4" pkg-update
The pkg-update word sees that we didn't specify a version and automatically searches for the latest version in all remote package repositories. If a newer version of Foo4 is found than the currently installed one, the existing Foo4 package will be updated to the newer version.
To remove a package, you type:
"foo4" pkg-uninstall
This will remove the Foo4 package and the user will be prompted to remove each dependency or keep the dependency packages.
This is a demonstration of how the package manager will be used by users to manage their packages. There are a lot of complexities to manage in writing a package manager and it is no easy task.
Open Issues
One issue that is still unresolved is whether, and how, the package manager and Factor are going to support installation of different versions of the same package.
Release Plan
Note that this is a preliminary release plan and will most probably change.
The 1st release will only contain the basic functionality for managing a remote package repository and installing, updating, and uninstalling packages from the remote package repositories.
The 2nd release will include automatic dependency resolution.
The 3rd release will implement the transaction mechanism.
The 4th release will include package signing.
The 5th release will add FTP support.
Factor Package Manager Preview
Work has been keeping me quite busy lately, but I use my spare time to write a package manager for Factor. This post provides an extremely early preview of the code so far. The following code is just for the pkg-install command. Currently implemented is a really bare-bones pkg-install command that will be used to install a package. Any comments / suggestions are welcome (as always).
USING: kernel http.client system sequences splitting io io.files io.launcher
crypto.md5 ;
IN: packages
<PRIVATE
: factor-base-dir ( -- path )
    vm "/" last-split1 drop ;

: base-dir ( -- path )
    ! push the full path of the package/ dir onto the stack
    factor-base-dir "/extra/packages" append ;

: pkg-download ( url -- ? )
    ! download a package from a remote repo (using the specified URL) to the
    ! local package cache
    dup download-name base-dir "/data/pkg-cache/" rot 3append
    download-to
    t ;

: pkg-in-cache? ( name -- ? )
    ! check if a package with the specified name exists in the local package
    ! cache
    base-dir "/data/pkg-cache" append directory
    [ first over ".fpkg" append = ] contains? nip ;

: registry-to-lines ( -- seq )
    ! converts the pkg-registry file to a sequence of lines so it can be
    ! read / parsed
    base-dir "/data/pkg-registry" append <file-reader> lines ;

: pkg-installed? ( name -- ? )
    ! check if a package is installed by reading the pkg-registry file
    registry-to-lines [ over = ] contains? nip ;

: pkg-extract ( name -- ? )
    ! extracts the files from the package tarball into the data/tmp/ dir
    cwd >r
    "tar -xjf " base-dir "/data/pkg-cache/" 3append swap ".fpkg" 3append
    base-dir "/data/tmp" append cd
    run-process drop r> cd
    t ;

: (pkg-valid?) ( path -- )
    ! walks the directory tree under path and prints the MD5 of every file;
    ! nothing is compared against the stored checksums yet, so no result is left
    cd cwd directory
    [
        dup second
        [ first (pkg-valid?) ".." cd ]
        [ first cwd "/" rot 3append dup file>md5str " :: " rot 3append print ] if
    ] each ;

: pkg-valid? ( name -- ? )
    ! verify the integrity of the package and return the result
    ! (currently always returns t)
    cwd swap base-dir "/data/tmp/" rot 3append (pkg-valid?) cd t ;

: (pkg-install) ( name -- )
    ! moves the extracted package files to the extra/ dir in the local Factor
    ! installation
    dup base-dir "/data/tmp/" rot 3append dup
    rot factor-base-dir "/extra/" rot 3append
    copy-tree delete-tree ;
PRIVATE>
: pkg-install ( name -- )
    dup pkg-installed? [
        ! drop the unused name so every branch leaves the stack empty
        drop ">> The package is already installed.\n" write
    ] [
        ! sync the local & remote index for the repo
        ! check the cached repo index for the latest version of the package
        ! check if that version is cached in the local package cache
        ! check dependencies of the package to install
        ! Prompt user if he would like to continue installation
        "Would you like to continue the install? [y/n]: " write readln
        "n" = [
            drop ">> Nothing was installed" print
        ] [
            ! download the required packages from the remote repo
            ">> downloading required packages" print
            dup "http://zenhackers.net/files/" swap append ".fpkg" append pkg-download
            drop ! drop result of pkg-download word, we should use this result to test for errors
            ! extract files from packages into their separate directories
            ">> extracting packages to temporary directory" print
            dup pkg-extract drop
            ! verify integrity of downloaded packages using checksums
            ">> verifying integrity of packages" print
            dup pkg-valid? [
                ! install the package in the extra/ directory
                ">> installing packages" print
                (pkg-install)
                ">> install completed" print
            ] [
                drop ">> The package integrity check FAILED !!!" print
            ] if
        ] if
    ] if ;