PROTOCOL TBro Amazon Lightsail Template 0.5
Introduction
This protocol sets up TBro in Docker on Amazon AWS Lightsail. It loads transcripts, peptides, and an InterProScan annotation, and creates a BLAST database server. It does not yet cover the pathway and expression features of TBro, but will in the future. Items in red must be replaced with your own information or files.
Materials
Transcriptome transcripts and peptides
T1-transcriptome.tr
T1-transcriptome.aa
TBro peptide tables (see TBro documentation)
InterProScan annotation (see InterProScan documentation)
Zipped BLAST databases and md5sums (see TBro documentation)
Procedure
Create an Amazon AWS account (requires a phone for the security check)
Create an Amazon AWS Lightsail instance (under Services > OS only > Ubuntu)
https://lightsail.aws.amazon.com/ls/webapp/home/resources
Assign the static IP to your Lightsail instance
Follow directions on Amazon AWS Lightsail
SSH to Amazon AWS Lightsail instance
Uninstall any old CE versions of Docker
sudo apt-get purge docker-ce
sudo rm -rf /var/lib/docker
sudo apt-get remove docker docker-engine
Follow the Docker CE installation instructions for Ubuntu
sudo apt-get install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
Check that the key fingerprint is 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
Install Docker CE for Ubuntu (amd64)
sudo apt-get update
sudo apt-get install docker-ce
Add your user to the docker group
sudo usermod -aG docker $USER
Reconnect to your instance so that the new group membership takes effect
Close the Lightsail terminal, then reconnect by SSH from the Lightsail interface
Run the Docker hello-world image as your user (now a member of the docker group)
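The group change only applies to new login sessions, so it can help to confirm the membership before running the test image. The following is a minimal sketch, assuming the standard hello-world image from Docker Hub; the helper name in_docker_group is hypothetical and not part of Docker.

```shell
# Return success if the given space-separated group list contains "docker".
in_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the test image only when docker is installed and this session
# already carries the docker group; otherwise say what to do next.
if command -v docker >/dev/null 2>&1 && in_docker_group "$(id -nG)"; then
    docker run --rm hello-world
else
    echo "docker is not usable in this session: reconnect by SSH and try again"
fi
```

If the image runs, Docker prints a short greeting and the container is removed again because of --rm.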
Configure system to start Docker when it boots up
sudo systemctl enable docker
Pull the Docker images used by TBro
docker pull greatfireball/generic_postgresql_db
docker pull tbroteam/generic_chado_db_reload
docker pull tbroteam/tbro_worker_ftp
docker pull tbroteam/tbro_worker
docker pull tbroteam/tbro_apache
Start CHADO database and install schema
docker run -d -e DB_NAME=chado -e DB_USER=tbro -e DB_PW=tbro --name "Chado_DB_4_TBro_official" greatfireball/generic_postgresql_db
docker run --rm -i -t --link Chado_DB_4_TBro_official:CHADO --name "Chado_DB_4_TBro_load_official" tbroteam/generic_chado_db_reload
Start database container for BLAST worker
docker run -d -e DB_NAME=worker -e DB_USER=worker -e DB_PW=worker --name "Worker_DB_4_TBro_official" greatfireball/generic_postgresql_db
Start FTP server to host BLAST database
docker run -d --name "Worker_FTP_4_TBro_official" -e FTP_USER="tbro" -e FTP_PW="ftp" tbroteam/tbro_worker_ftp
Start worker to execute BLAST server
docker run -d --link Worker_DB_4_TBro_official:WORKER --link Worker_FTP_4_TBro_official:WORKERFTP --name "TBro_Worker_official" tbroteam/tbro_worker
docker exec -i -t TBro_Worker_official /home/tbro/worker_build_installation.sh
Build TBro Docker container
docker run -d --link Chado_DB_4_TBro_official:CHADO --link Worker_FTP_4_TBro_official:WORKERFTP --link Worker_DB_4_TBro_official:WORKER --name "TBro_official" -p 80:80 tbroteam/tbro_apache
docker exec -i -t TBro_official /home/tbro/build_installation.sh
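At this point five containers should be running. The sketch below is one possible way to check that, using the container names from the docker run commands above; the helper name missing_containers is hypothetical.

```shell
# Container names used in the docker run commands of this protocol.
EXPECTED_CONTAINERS="Chado_DB_4_TBro_official Worker_DB_4_TBro_official Worker_FTP_4_TBro_official TBro_Worker_official TBro_official"

# Read running container names (one per line) on stdin and print every
# expected name that is absent from that list.
missing_containers() {
    running="$(cat)"
    for name in $EXPECTED_CONTAINERS; do
        printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
    done
}

# Only query Docker when it is installed; --format prints just the names.
if command -v docker >/dev/null 2>&1; then
    docker ps --format '{{.Names}}' | missing_containers
fi
```

No output means that every expected container is up. Any name that is printed can be inspected with docker logs followed by that name.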
Enter TBro Docker container
docker exec -it TBro_official /bin/bash
Install apache2 utilities
sudo apt-get install apache2-utils
sudo htpasswd -c /etc/apache2/.htpasswd USER
Modify Apache configuration file
sudo apt-get install nano
sudo nano /etc/apache2/sites-enabled/000-default.conf
<VirtualHost *:80>
# The ServerName directive sets the request scheme, hostname and port that
# the server uses to identify itself. This is used when creating
# redirection URLs. In the context of virtual hosts, the ServerName
# specifies what hostname must appear in the request's Host: header to
# match this virtual host. For the default virtual host (this file) this
# value is not decisive as it is used as a last resort host regardless.
# However, you must set it for any further virtual host explicitly.
#ServerName www.example.com
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
# Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
# error, crit, alert, emerg.
# It is also possible to configure the loglevel for particular
# modules, e.g.
#LogLevel info ssl:warn
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
# For most configuration files from conf-available/, which are
# enabled or disabled at a global level, it is possible to
# include a line for only one particular virtual host. For example the
# following line enables the CGI configuration for this host only
# after it has been globally disabled with "a2disconf".
#Include conf-available/serve-cgi-bin.conf
#### BEGIN: ADD THIS TEXT TO FILE
<Directory "/var/www/html">
AuthType Basic
AuthName "Restricted Content"
AuthUserFile /etc/apache2/.htpasswd
Require valid-user
</Directory>
#### END: ADD THIS TEXT TO FILE
</VirtualHost>
# vim: syntax=apache ts=4 sw=4 sts=4 sr noet
Test the Apache syntax, then leave the container and restart TBro from the Lightsail host
sudo apache2ctl configtest
exit
docker restart TBro_official
Enter TBro Docker container
docker exec -it TBro_official /bin/bash
Add species information to CHADO in TBro database
tbro-db organism insert --genus Octopus --species chierchiae --common_name "pygmy octopus" --abbreviation O.chierchiae
Check organism table for organism id
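One way to look up the organism id is to query the Chado organism table directly. This is a sketch under assumptions: the CHADO_* variable names are inferred from the Docker link alias CHADO, by analogy with the WORKER_* variables used later in this protocol, and are not confirmed by the TBro documentation.

```shell
# Chado keeps organisms in the "organism" table; organism_id is the value
# later passed to tbro-import as --organism_id.
ORGANISM_SQL="SELECT organism_id, genus, species, common_name FROM organism ORDER BY organism_id;"

# ASSUMPTION: the CHADO_* variables exist because the TBro container was
# linked with the alias CHADO; adjust the names if your container differs.
if command -v psql >/dev/null 2>&1 && [ -n "${CHADO_PORT_5432_TCP_ADDR:-}" ]; then
    PGPASSWORD="$CHADO_ENV_DB_PW" psql -U "$CHADO_ENV_DB_USER" \
        -h "$CHADO_PORT_5432_TCP_ADDR" -p "$CHADO_PORT_5432_TCP_PORT" \
        -d "$CHADO_ENV_DB_NAME" -c "$ORGANISM_SQL"
else
    echo "Run inside the TBro container: $ORGANISM_SQL"
fi
```

The organism_id shown for your species replaces the example value 13 in the import commands that follow.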
OPTIONAL: update or correct errors in the organism table
tbro-db organism update --genus Octopus --species chierchiae --common_name "tiny octopus" --abbreviation O.chierchiae
Build directory system in TBro Docker container
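The protocol does not prescribe a directory layout. The sketch below shows one hypothetical layout with one subdirectory per input type, mirroring the source paths used in the scp commands. The variable TBRO_DATA_DIR and the directory names are illustrative assumptions only.

```shell
# HYPOTHETICAL layout: one subdirectory per input type. TBRO_DATA_DIR is
# an illustrative variable, not something TBro itself reads.
TBRO_DATA_DIR="${TBRO_DATA_DIR:-$HOME/tbro-data}"

# -p creates missing parents and does not fail if a directory exists.
mkdir -p "$TBRO_DATA_DIR/transcripts" \
         "$TBRO_DATA_DIR/peptides" \
         "$TBRO_DATA_DIR/annotations" \
         "$TBRO_DATA_DIR/blastdbs"

ls "$TBRO_DATA_DIR"
```

The later commands refer to files in the current directory, so change into the matching subdirectory before each copy and import step, or keep all files in a single directory instead.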
Copy transcript files into TBro Docker container
scp id@address:/path/to/transcripts/T1-transcriptome.tr .
Create list of all identifiers for each transcriptome
grep ">" T1-transcriptome.tr | perl -pe 's/>(\S+).*/$1/' > T1-identifiers.ids
Import transcript sequence ids into TBro database
tbro-import sequence_ids --organism_id 13 --release octopus-T1 --file_type only_isoforms T1-identifiers.ids
Import transcript sequences into TBro database
tbro-import sequences_fasta --organism_id 13 --release octopus-T1 T1-transcriptome.tr
Copy TBro peptide fasta files into TBro Docker container
scp id@address:/path/to/peptides/T1-transcriptome.aa .
Copy TBro peptide tbl files into TBro Docker container
scp id@address:/path/to/peptides/T1-tbro-table.tbl .
Import peptide table into TBro database
tbro-import peptide_ids --organism_id 13 --release octopus-T1 T1-tbro-table.tbl
Import peptide sequences into TBro database
tbro-import sequences_fasta --organism_id 13 --release octopus-T1 T1-transcriptome.aa
Copy InterProScan tsv files into TBro Docker container
scp id@address:/path/to/annotations/T1-annotations.tsv .
Import InterProScan annotations into TBro database
tbro-import annotation_interpro --organism_id 13 --release octopus-T1 -i interproscan-5.22-61 T1-annotations.tsv
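Before or after the annotation import it may be worth checking that the identifiers in the InterProScan table match the peptide identifiers. The sketch below works on toy data and relies only on the fact that the first column of InterProScan TSV output is the protein accession. Whether TBro rejects or silently skips unmatched identifiers is not stated here, so treat this as an optional sanity check.

```shell
demo_dir="$(mktemp -d)"

# Toy peptide FASTA and a toy annotation table; only column 1 matters here.
printf '>pep1 desc\nMKV\n>pep2\nMLL\n' > "$demo_dir/toy.aa"
printf 'pep1\tmd5\t3\tPfam\tPF00001\n' > "$demo_dir/toy.tsv"
printf 'pep9\tmd5\t3\tPfam\tPF00002\n' >> "$demo_dir/toy.tsv"

# Sorted unique identifiers from each file.
grep ">" "$demo_dir/toy.aa" | sed -e 's/^>//' -e 's/[[:space:]].*//' | sort -u > "$demo_dir/fasta.ids"
cut -f 1 "$demo_dir/toy.tsv" | sort -u > "$demo_dir/tsv.ids"

# comm -13 prints lines found only in the second file: annotated
# identifiers with no matching peptide sequence.
unmatched="$(comm -13 "$demo_dir/fasta.ids" "$demo_dir/tsv.ids")"
echo "annotated but not in FASTA: ${unmatched:-none}"
```

With real data, replace toy.aa and toy.tsv with T1-transcriptome.aa and T1-annotations.tsv.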
Pull zipped blast databases and md5sums files into TBro Docker container
scp id@address:/path/to/transcript/blastdbs/T1-blastdb-TR.zip .
scp id@address:/path/to/peptides/blastdbs/T1-blastdb-AA.zip .
scp id@address:/path/to/blastdb/md5sums/list-zipped-blastdb-md5sums .
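After copying, the archives can be checked against the md5sums list. This sketch uses stand-in files in a scratch directory and assumes that the list is in the format written by md5sum (checksum, two spaces, file name); if your list is formatted differently, compare the sums by eye instead.

```shell
demo_dir="$(mktemp -d)"

# Stand-ins for the real archives.
printf 'transcript db\n' > "$demo_dir/T1-blastdb-TR.zip"
printf 'peptide db\n' > "$demo_dir/T1-blastdb-AA.zip"

# ASSUMPTION: the list uses md5sum's "<checksum>  <file>" format.
( cd "$demo_dir" && md5sum T1-blastdb-TR.zip T1-blastdb-AA.zip > list-zipped-blastdb-md5sums )

# Recompute and compare; a non-zero exit status means a corrupted copy.
if ( cd "$demo_dir" && md5sum -c list-zipped-blastdb-md5sums ); then
    md5_status=ok
else
    md5_status=mismatch
fi
echo "checksum status: $md5_status"
```

The same md5 values are needed again when filling in queue_config.sql.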
Copy zipped blast databases into Docker FTP container
curl --data-binary --ftp-pasv --user $WORKERFTP_ENV_FTP_USER:$WORKERFTP_ENV_FTP_PW -T T1-blastdb-TR.zip ftp://"$WORKERFTP_PORT_21_TCP_ADDR"/
curl --data-binary --ftp-pasv --user $WORKERFTP_ENV_FTP_USER:$WORKERFTP_ENV_FTP_PW -T T1-blastdb-AA.zip ftp://"$WORKERFTP_PORT_21_TCP_ADDR"/
Update the queue_config.sql
cp queue_config.example.sql queue_config.sql
-- database files available. name is the name it will be referenced by, md5 is the zip file's sum, download_uri specifies where the file can be retrieved
-- database_files( name ) is the name of the zipped blastdb file without '.zip' - and is referred to as program_database_relationships( database_name ) below
INSERT INTO database_files (name, md5, download_uri) VALUES
('T1-blastdb-TR', 'becd11699b54377106482fdc7f54906d', 'ftp://172.17.0.4/T1-blastdb-TR.zip'),
('T1-blastdb-AA', '52acbc6e906029a04ea3ac850bb75674', 'ftp://172.17.0.4/T1-blastdb-AA.zip');
-- contains information which program is available for which database.
-- additionally, 'availability_filter' can be used to e.g. restrict use for an organism-release combination
-- availability_filter is ( organism_id )_( release ) from earlier steps
INSERT INTO program_database_relationships (programname, database_name, availability_filter) VALUES
('blastn', 'T1-blastdb-TR', '13_octopus-T1'),
('blastp', 'T1-blastdb-AA', '13_octopus-T1'),
('blastx', 'T1-blastdb-AA', '13_octopus-T1'),
('tblastn', 'T1-blastdb-TR', '13_octopus-T1'),
('tblastx', 'T1-blastdb-TR', '13_octopus-T1');
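The rows for database_files can be typed by hand, but generating them avoids copy errors in the md5 values. The sketch below is one way to do that; the helper name database_file_row is hypothetical, and the demo uses a stand-in archive rather than a real BLAST database.

```shell
demo_dir="$(mktemp -d)"
printf 'transcript db\n' > "$demo_dir/T1-blastdb-TR.zip"

# Print one VALUES row for database_files: name is the file name without
# ".zip", md5 is computed from the file, the URI points at the FTP host.
database_file_row() {
    zip_path="$1"
    ftp_host="$2"
    name="$(basename "$zip_path" .zip)"
    md5="$(md5sum "$zip_path" | cut -d ' ' -f 1)"
    printf "('%s', '%s', 'ftp://%s/%s.zip')\n" "$name" "$md5" "$ftp_host" "$name"
}

# 172.17.0.4 is the example address from this protocol; inside the TBro
# container WORKERFTP_PORT_21_TCP_ADDR holds the address of the FTP container.
database_file_row "$demo_dir/T1-blastdb-TR.zip" "${WORKERFTP_PORT_21_TCP_ADDR:-172.17.0.4}"
```

The printed row can be pasted into the VALUES list of queue_config.sql; rows are separated by commas and the last one ends with a semicolon.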
Import configuration file
PGPASSWORD=$WORKER_ENV_DB_PW psql -U $WORKER_ENV_DB_USER -h $WORKER_PORT_5432_TCP_ADDR -p $WORKER_PORT_5432_TCP_PORT <queue_config.sql
Log in to your TBro database
Enter your Amazon AWS Lightsail static IP address (assigned to the instance earlier) into a web browser
Enter the user name and password you created with htpasswd
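The password protection can also be checked from a terminal. This is a small sketch: TBRO_HOST is a hypothetical variable that you set to your static IP address, and the helper explain_status is illustrative only.

```shell
# Translate an HTTP status code into what it means for this setup.
explain_status() {
    case "$1" in
        401) echo "password protection active" ;;
        200) echo "page served" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# TBRO_HOST is a hypothetical variable holding your static IP address.
if [ -n "${TBRO_HOST:-}" ] && command -v curl >/dev/null 2>&1; then
    # Without credentials the protected site should answer 401.
    explain_status "$(curl -s -o /dev/null -w '%{http_code}' "http://$TBRO_HOST/")"
fi
```

Adding --user with your htpasswd user name to the curl call makes curl prompt for the password, after which the site should no longer answer 401.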