
Tree Of Life - Dependency Installer for Scientific Software

Main Index:

  • Description
  • Motivation
  • Vision
  • Workflow
  • Implementation
  • Adopted
  • License

  • Blog

  • Complete Index
  • Tree-of-Life

    Dependency Installer for Scientific Software
    site under construction, some links may not be functional

    Description

    Tree-of-Life (ToL) is a stand-alone Python library designed to ease the installation of the dependencies required by scientific software whose end users are mostly non-developers, for whom complex terminal operations may not be trivial. ToL reduces installation and updating to simple double-click operations. Additionally, it provides a user-friendly interface to drive the installation process.

    As the name implies, it provides a tree of routines that give life to your project (on the users’ computers) :-).

    Note

    Tree-of-Life was initially developed to serve Python-dependent projects; for that reason, it can manage the installation of the Python libraries required by the host project. However, Tree-of-Life can be extended (or branched to new projects) in the future to serve other languages or installation routines if needed. The following description focuses on the current version.


    Motivation

    Continuous interaction with end users through workshops, seminars, mailing-list Q&A and other channels led us to understand that software installation is a crucial bottleneck between users and the software package itself.

    When asking users to install a software package that we aim to distribute across a large scientific community, there is a clear difference between those with programming skills, regardless of skill level, and those without. For the latter group, simple steps such as installing Anaconda and configuring (conceptualizing and using) a Python environment are definitely not straightforward. They can create a great barrier between the user and the software, and can even drive users away from the software completely.

    Therefore, when developing software for a community of users that is not expected (nor required) to have any programming skills, it is necessary to keep the installation process as close as possible to the most universal standards, which are, and have been for decades:

    1. download
    2. unpack
    3. single-click install
    4. single-click run

    Maintaining these standards can be challenging if one considers the diversity of computer platforms available - each user has a different computer/OS/configuration.

    Library vs. Software Package

    An important concept behind the motivation for developing Tree-of-Life is that distributing a Library for programming is different from distributing a Software Package for practical usage; ToL focuses on the latter. The confusion arises, however, from the fact that much scientific software is nowadays developed and distributed as programming libraries (mostly in Python), and as such is conveniently distributed through Pip or Anaconda packages. But distributing software packages through Pip or Conda might not be the best option (or the only option to consider) when these packages are not meant to be used as libraries and instead target non-developer users.


    Vision

    Vocabulary:

    Aim


    Workflow

    This section explains how Tree-of-Life operates.

    Tree-of-Life aims to be platform independent: because it is written in Python, it should run wherever Python can run, and Python itself provides an excellent interface between developers and the different OS platforms. Tree-of-Life is written entirely in Python and is compatible with Python 2.7 and the 3.x series. We consider it safe to rely on Python because nowadays virtually every computer (within our scope) has Python installed; installing Python from scratch on the user’s computer lies outside this project’s scope.

    Tree-of-Life is designed to serve Python-based or Python-dependent projects. It installs a Python distribution (using Miniconda) and the host project’s required Python dependencies in an encapsulated environment that resides in the host project’s unpacked folder. In this way, the software installation never mixes with the user’s system.

    Also, its architecture allows the developer to write the project’s executable files so that they are configured according to the project’s dependencies and made available to the user after installation.

    Read on how to implement Tree-of-Life in your project here.

    The Installation step

    The aim of Tree-of-Life is to set up the host project on the user’s computer with minimum effort from the user by providing a double-click (or single-run) installation process, for example:

    python install_the_software.py
    

    or (depending on the developer’s choice)

    ./install_the_software
    

    If you are a developer, you are invited to read the tree_of_life.py file to understand its workflow from the developer’s point of view.

    From the user’s point of view, the installation workflow is as follows:

    1. Asks the user whether to install the Python dependencies automatically or manually.
      1. If automatic (recommended):
        1. checks the available disk space
        2. checks whether a previous Miniconda installation exists inside the host project folder
        3. downloads and installs the latest version of Miniconda locally in the project’s folder; the Miniconda version to install is derived from the user’s platform
        4. installs the project’s Python ENV as described in the yml file inside the install folder
      2. If manual (proficient Python users):
        1. warns the user that they must install the required dependencies manually; for proficient Python users and developers this should be straightforward, as all the information is stored in the yml file
        2. does NOT install any Python packages
    2. Creates the executable files following the executables.py file
      1. executable files are linked to the project’s Python dependencies via a shebang
      2. the shebang is defined according to the answer to the first query (Miniconda installation step):
        1. if Miniconda and the Python ENV were installed automatically, shebangs will point to that ENV
        2. if the user decided to install the Python dependencies manually, the shebangs will point to the system’s current Python executable
        3. in manual installations, the user must also reconfigure the shebangs to point to the correct Python environment, should they want to use shebangs at all
      3. executable files are given executable permissions system-wide. If you want to restrict these permissions, you can alter them after installation.
    3. Creates the installation_vars.py file. This file registers all installation variables that are required for updating purposes. If this file is removed or altered manually, future updates will fail and can compromise the whole installation.
    4. All installation output regarding information and debugging is written to a .log file.
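
    The first two automatic steps above (checking disk space and deriving the Miniconda installer from the user’s platform) can be sketched as follows. This is only an illustration, not the code in tree_of_life.py: the helper names are hypothetical, the installer file names are assumed to follow the pattern published at repo.anaconda.com/miniconda, and the sketch targets Python 3 (shutil.disk_usage is not available in Python 2.7).

```python
import platform
import shutil

# Assumption: Miniconda installers are named
# Miniconda3-latest-<OS>-<arch>.<ext>, as listed on repo.anaconda.com/miniconda.
OS_NAMES = {"Linux": "Linux", "Darwin": "MacOSX", "Windows": "Windows"}


def miniconda_installer_name(system=None, machine=None):
    """Return the installer file name for a platform (hypothetical helper)."""
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()
    if system not in OS_NAMES:
        raise ValueError("unsupported platform: {}".format(system))
    # Windows reports 64-bit Intel/AMD machines as AMD64.
    arch = "x86_64" if machine in ("x86_64", "amd64") else machine
    extension = "exe" if system == "Windows" else "sh"
    return "Miniconda3-latest-{}-{}.{}".format(OS_NAMES[system], arch, extension)


def has_enough_space(folder, required_bytes):
    """Return True if the disk holding `folder` has `required_bytes` free."""
    return shutil.disk_usage(folder).free >= required_bytes
```

    Called without arguments, miniconda_installer_name() inspects the machine it runs on; passing the system and machine explicitly makes the mapping easy to check for other platforms.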

    Updater

    Tree-of-Life provides an updater script to allow the user to keep up to date with the host project’s latest version. The updater is part of the executable files and is created during the installation process inside the bin folder.

    The updater was designed to keep the user’s installation up to date with the developer’s GitHub project. However, it is by no means restricted to GitHub repositories; by default, the updater simply requires a web link to a ZIP file containing the project. (For developers) You can read the updater workflow in the update_script_code variable inside executables.py.

    From the user’s point of view, the update routines are as follows:

    1. Downloads the latest version of the software
    2. Unpacks it to a temporary folder
    3. Removes the previous version (its folders)
    4. Files that do not belong to the project but are stored in the main project folder are not removed during the update.
    5. Moves the new version files to the project installation folder
    6. If the installation was performed in automatic mode:
      1. checks if the ENV needs to be updated, if yes, updates it
    7. Rebuilds the executable files. In this way, these are also updated
    8. Rebuilds the installation_vars.py file
    9. All updater output regarding information and debugging purposes is written to a .log file.
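
    Steps 2 to 5 above can be sketched as follows. This is a simplified illustration and not the updater code stored in executables.py: apply_update and its project_items argument are hypothetical, and the sketch assumes that the ZIP archive holds the project items at its root (archives downloaded from GitHub usually wrap them in one extra top-level folder).

```python
import os
import shutil
import tempfile
import zipfile


def apply_update(zip_path, install_dir, project_items):
    """Replace the project's own files in install_dir with those in zip_path.

    project_items is a hypothetical list of top-level file or folder names
    that belong to the project; anything else in install_dir is kept.
    """
    tmp_dir = tempfile.mkdtemp()
    try:
        # Unpack the new version to a temporary folder.
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp_dir)
        for name in project_items:
            old = os.path.join(install_dir, name)
            new = os.path.join(tmp_dir, name)
            if not os.path.exists(new):
                continue  # not shipped in this release; keep the old copy
            # Remove the previous version of this item.
            if os.path.isdir(old):
                shutil.rmtree(old)
            elif os.path.exists(old):
                os.remove(old)
            # Move the new version into the installation folder.
            shutil.move(new, old)
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

    Because only the names listed in project_items are touched, any file the user stored in the main project folder survives the update.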

    Read more on the updater script here.

    The executable files

    By design, Tree-of-Life considers the host project’s executable files as Python scripts (see note). However, this is definitely not mandatory and can be easily adapted to fit other needs.

    Tree-of-Life uses shebangs to point the executable files to the project’s Python ENV, that is, to the correct Python interpreter.

    For UNIX systems the executable files are created without extension, so that the user is naturally driven to execute them as follows:

    ./bin/exec_1
    

    On Windows, executable files are created with the .py extension (by default). However, they should not be executed with $ python exec_1.py, because the Python interpreter called in this way takes precedence and the shebang is ignored. You should warn users to run $ exec_1.py directly, so that the system uses the Python ENV that the shebang points to.


    Implementation

    How to implement Tree-of-Life DISS in your project

    The architecture of Tree-of-Life

    The architecture of Tree-of-Life is straightforward; I invite you to read through the tree_of_life.py script to understand the installation process and its dependencies.

    How to implement Tree-of-Life in your project

    Firstly

    Tree-of-Life is composed of a main installer file tree_of_life.py and an install folder where all its dependencies and template files are stored. To implement Tree-of-Life in your project copy the main installer file and the install folder to your project repository/folder, always according to the project license.

    example:

    MyProject/
        -- MyProject_folder/
            -- (... all your project src files ...)
        
        -- install/
        -- tree_of_life.py (this is the main installer, rename it at will)
    

    Secondly

    What must be changed to adapt Tree-of-Life to your project:

    host project variables

    Configure Tree-of-Life according to your project (the host project). This is very easy: simply update the variables in host_project_vars.py.

    There is also a BIG BANNER at the end, which is displayed as an advertising header at the beginning of the installation; change it to fit your project if desired.

    env YML file

    The template_env.yml file describes the Anaconda Python environment that serves your project and that Tree-of-Life will install; in other words, the Python dependencies of your project. Configure it according to your needs.

    Explaining a YML file lies outside the scope of this wiki; read Anaconda’s official documentation here.

    The first line of the template_env.yml file, # version: 1, serves updating purposes. If in future releases of your project you update its dependencies, you should increase the yml version number (# version: 2) and the Tree-of-Life updater script will properly update the Miniconda ENV in the user’s installation folder.
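
    A minimal sketch of how such a version header could be read and compared is shown below. The function names are hypothetical and the actual updater may implement this check differently; the only thing taken from the text above is the # version: N first-line convention.

```python
import re


def read_env_version(yml_text):
    """Return the integer N from a first line '# version: N', else None."""
    lines = yml_text.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"#\s*version:\s*(\d+)\s*$", first_line)
    return int(match.group(1)) if match else None


def env_needs_update(installed_version, new_yml_text):
    """Return True if the new yml declares a higher version number."""
    new_version = read_env_version(new_yml_text)
    return new_version is not None and new_version > installed_version
```

    With this logic, an environment installed from # version: 1 is rebuilt only when the downloaded yml file declares # version: 2 or higher; a file without the header never triggers an update.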

    the executables

    By design, Tree-of-Life considers the host project’s executable files as Python scripts. In the vision of the project, if the developer wants to have full control over what the user can do and when, the project’s executable files should not be readily available. Instead, executable files should be created during the installation process and properly linked to the Python ENV’s executable (dependencies).

    To accomplish this, Tree-of-Life is designed such that the executable scripts should be coded as raw strings in the executables.py file and headed by a formatter sign #! {} where the shebang will be inserted. The executables.py file reads very easily and provides executable templates for you to configure according to your project needs.
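
    The idea can be sketched as follows. The template content and the write_executable helper are hypothetical and only illustrate the #! {} convention described above; refer to executables.py for the templates that Tree-of-Life actually ships.

```python
import os
import stat

# A script stored as a raw string; the shebang slot is the leading '#! {}'.
# Hypothetical example, not a template taken from executables.py.
EXEC_TEMPLATE = r"""#! {}
import sys
print("running with " + sys.executable)
"""


def write_executable(path, template, python_exe):
    """Write the script with its shebang filled in and mark it executable."""
    with open(path, "w") as handle:
        handle.write(template.format(python_exe))
    # Add execute permission for user, group and others.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

    Since the shebang is inserted with str.format, any literal curly braces in the script body of such a template would have to be doubled ({{ and }}) so that they are not mistaken for format fields.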

    Thirdly

    Change the Tree-of-Life library at will but within the LICENSE terms. Some helpful information:

    the main installer

    Feel free to change the main installer name to fit your project requirements, for example, install_mysoftware.py.

    messages.py

    The majority of the messages that are displayed during the installation process are organized inside the messages.py file. Messages and titles can be formatted using the _formats* functions. These messages provide a functional template for you to use right away, yet, you can very easily rewrite these messages to properly fit your project.

    The Updater script

    Also, an updater script is provided in executables.py. The updater is ready to use as long as the related variables in host_project_vars.py are configured according to the host project.

    License headers

    Alter the license headers according to this project license so that a reference to the official work is made and the changes you have implemented are identified. Suggestions:

    THIS FILE WAS ADAPTED FROM TREE-OF-LIFE PROJECT (version 1.1.2 - LGPLv3)
    AND MODIFIED ACCORDING TO THE NEEDS OF THE <your project name> PROJECT.
    
    Visit the original Tree-of-Life project at:
    
    https://github.com/TreeOfLife-diss
    

    Let us know if you implement Tree-of-Life in your project, we would love to add it to our list in the main README file. :-)


    Projects which adopted Tree-of-Life

    Hosting Tree-of-Life in a project does not prevent the project from being available through other sources, such as Conda packages or PyPI. Some scientific projects already use Tree-of-Life to manage their installation and updating routines:


    LICENSE

    Tree-of-Life is licensed under LGPL version 3, and you are allowed to modify and use it according to the terms of this license.

    LGPL