X-Git-Url: https://git.donarmstrong.com/?p=don.git;a=blobdiff_plain;f=genetics%2Fdbsnp_mirror.mdwn;h=5f15f02ec194586204ec5e1da67223369ca274e7;hp=addc10ea1c9dbebcb901e658c06b8e2bd9a7d829;hb=982ebcf0e294afee8456ccc774468e3108447a85;hpb=92a2a80dbf778263226f2cbdc1e5cd3583432808 diff --git a/genetics/dbsnp_mirror.mdwn b/genetics/dbsnp_mirror.mdwn index addc10e..5f15f02 100644 --- a/genetics/dbsnp_mirror.mdwn +++ b/genetics/dbsnp_mirror.mdwn @@ -8,10 +8,9 @@ download the databases and schemas for the current release of dbsnp. For humans, this is currently ≈60G, and takes a while to retrieve (about 24 hours or so). - for a in {organism,shared}_{data,schema}; do - lftp -c "open ftp.ncbi.nlm.nih.gov; cd /snp/organisms/human_9606/database/; mirror $a"; - done; - + rsync -rvP --include 'organism_*' --exclude '**' rsync://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/database/ . + rsync -rvP rsync://ftp.ncbi.nlm.nih.gov/snp/database/shared_data . + rsync -rvP rsync://ftp.ncbi.nlm.nih.gov/snp/database/shared_schema . Preparing SQL Schemas --------------------- @@ -79,6 +78,45 @@ and constraints. It looks like the following: ${UTIL_DIR}/human_gty1_indexes_creation.pl index |psql snp; ) +Permissions on the database +--------------------------- + +Since I have my database on a server separate from the workstations +(and other machines) that I often do work on, I need remote access to +the database. To make this easy (and avoid having to hard code +database details into the few dozen scripts I use), I created a +postgresql service called *snp*. + +An entry like this: + + [snp] + dbname=snp + user=snpuser + password=somepassword + port=9212 + host=snpdb.donarmstrong.com + +in the pg_service.conf file (in `/etc/postgresql-common` or +`PGSYSCONFDIR`) will configure the service. + +You then need to make sure that the database server is listening on +the appropriate ip address (edit the database's `postgresql.conf` +file), and that *snpuser* has select privileges and can connect. +A line like the following in pg_hba.conf + + host snp snpuser 192.168.0.0/24 md5 + +and the following sql will set that up for you. + + CREATE USER snpuser WITH PASSWORD ('somepassword'); + GRANT SELECT ON ALL TABLES IN SCHEMA public TO snpuser; + +Then, to test, you should be able to run: + + psql "service=snp" -c 'SELECT * FROM snp LIMIT 5'; + +On another machine. + Querying the database ---------------------