Title: Using the Core Facility, Hans-Erik G. Aronson
1. Using the Core Facility
Hans-Erik G. Aronson
http://amdec-bioinfo.cu-genome.org
2. Core Facility User Accounts
Online application: http://amdec-bioinfo.cu-genome.org
- Short description of new projects for reporting purposes.
- Contact info for the person requesting the account.
- Funding source and number, etc.
- PI on the project.
- In general, each person using the machines should have their own account.
- If shared accounts are used, the named person should be easily reachable to deal with problems.
- Larger projects may have their own userid. First log in as yourself, then su to the project.
3. Use of Accounts
- These accounts are only for working on AMDeC projects.
- Not for storing personal files, music files, etc.
- If you need software installed, please contact us.
- Development work (compiling and testing software) should be done on our test machines. Contact us for access.
4. Grants and Publications
- Please cite the AMDeC Bioinformatics Core Facility at the Columbia Genome Center in any publications arising out of work here.
- Please include funding requests for use of the Facility in grants (details negotiated for individual projects).
5. Access
- WWW Interfaces
- Direct Login (Unix)
- SOAP (Simple Object Access Protocol) client/server
6. WWW Interfaces
blaster.cu-genome.org
- BlastMachine: NCBI BLAST interface
- GeneMatcher2: comprehensive access to all machine capabilities and runtime information.
- Searches
- Queue Management
- File management
7. blaster.cu-genome.org
8. Paracel BLAST
9. blaster.cu-genome.org
10. GeneMatcher2 BioView Workbench
11. GeneMatcher2 New Search: Available Algorithms
12. GeneMatcher2 New Search
13. Smith-Waterman DNA - Search Submission Screen
14. Smith-Waterman DNA - Search Status
BioView Workbench saves results until they are explicitly deleted. Please delete results you no longer need!
15. Smith-Waterman DNA - Hits
16. Database Security
At present, databases you install on the GeneMatcher2 are not secured from other users.
If this matters for your data, ask us first!
17. Sequence Databases
- NCBI: blastdb databases (weekly updates)
- EMBL
- IPI: International Protein Index; nonredundant, curated (SwissProt, TrEMBL, RefSeq and Ensembl)
- TIGR
- UCSC Golden Path/NCBI: Human and Mouse assemblies
- DOE JGI: Fugu (Pufferfish) assembly
- WashU: Pfam HMMs (GeneMatcher2 only)
- NCBI TraceDB: Genomic Mouse Reads (BlastMachine only)
- Other databases as needed for specific projects.
18. Direct Login (UNIX)
adredhat.cu-genome.org
- We require Secure Shell: ssh and sftp.
- Available from www.ssh.com or www.openssh.org
19. Command Line Use
- BlastMachine: pb
- pb blastall -p blastn -d ncbi/nt -i nucseq.fasta -o nucseq.out
- (same options as NCBI BLAST)
- GeneMatcher2: btk
- btk swp d=nr q=protseq.fasta m=blosum62 out=protseq.out
- All additional software: PTA, PGA, PFP, BioPerl, MUMmer, etc.
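The pb example above can be scripted when many searches need to be launched. The following is a minimal sketch, assuming pb accepts the standard NCBI blastall flags (-p, -d, -i, -o) as the slide states. The helper name pb_blastall_argv is hypothetical and is not part of the facility's software.

```python
import shlex


def pb_blastall_argv(program, database, query_path, output_path):
    """Build the argument list for a BlastMachine search through the pb wrapper.

    Assumes pb takes the same options as NCBI blastall:
    -p program, -d database, -i query file, -o output file.
    """
    return [
        "pb", "blastall",
        "-p", program,
        "-d", database,
        "-i", query_path,
        "-o", output_path,
    ]


argv = pb_blastall_argv("blastn", "ncbi/nt", "nucseq.fasta", "nucseq.out")
# Print the command line instead of running it, so it can be checked first.
print(shlex.join(argv))
```

On the login host, the same list could be passed to subprocess.run(argv, check=True) to start the search.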
20. BlastMachine Database Directory Structure
db1/
  ncbi/
  embl/
  genomes -> ../genomes/
  tigr -> ../tigr/
  user -> ../user/
  projects -> ../projects/
  other -> ../other/
db2/
  ncbi/
  embl/
  genomes -> ../genomes/
  tigr -> ../tigr/
  user -> ../user/
  projects -> ../projects/
  other -> ../other/
21. Sequence Size Limits
The human and mouse assembled chromosomes are provided as 100K fragments, with a 10K overlap between adjacent fragments.
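Because hits are reported relative to a fragment, it can help to map them back to chromosome coordinates. The sketch below illustrates the 100K fragment / 10K overlap scheme under one assumption that the slide does not confirm: fragments start every 90,000 bases, counted from position 0. The function names are hypothetical.

```python
FRAGMENT_SIZE = 100_000  # fragment length stated on the slide
OVERLAP = 10_000         # overlap between adjacent fragments stated on the slide
STEP = FRAGMENT_SIZE - OVERLAP  # assumed distance between fragment starts


def fragment_bounds(chrom_length):
    """Yield (start, end) for each fragment, 0-based and end-exclusive."""
    start = 0
    while True:
        end = min(start + FRAGMENT_SIZE, chrom_length)
        yield start, end
        if end == chrom_length:
            break
        start += STEP


def to_chromosome_coord(fragment_index, pos_in_fragment):
    """Convert a 0-based position inside a fragment to a chromosome position."""
    return fragment_index * STEP + pos_in_fragment


for index, (start, end) in enumerate(fragment_bounds(250_000)):
    print(index, start, end)
```

If the facility numbers fragments differently (for example, starting from 1), the offset arithmetic needs to be adjusted accordingly.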
22. Passwords
- Two separate authentication systems:
- Direct login (UNIX) and BlastMachine
- GeneMatcher2 Web Interface (BioView Workbench)
- If you change one, change the other to avoid confusion.
23. Changing Passwords - UNIX
Direct login and BlastMachine
- On adredhat.cu-genome.org
- Enter yppasswd
24. Changing Passwords - GM2
blaster.cu-genome.org
25. Large Jobs
Submitting large numbers of queries in one job allows the most efficient use of the machines.
- Allows GM2 to keep its pipeline full.
- Allows BlastMachine to make the most efficient use of databases loaded into memory.
But please ask before starting a job that might run for days or weeks!
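One way to submit many queries in a single job is to combine them into one multi-FASTA query file first. This is a minimal sketch, assuming the search tools accept a query file containing several FASTA records. The function name merge_fasta and the file names are hypothetical.

```python
from pathlib import Path


def merge_fasta(query_paths, combined_path):
    """Concatenate FASTA files into one file and return the number of records."""
    record_count = 0
    with open(combined_path, "w") as combined:
        for path in query_paths:
            text = Path(path).read_text()
            # Make sure each file ends with a newline so records do not run together.
            if text and not text.endswith("\n"):
                text += "\n"
            combined.write(text)
            # Every FASTA record starts with a '>' header line.
            record_count += sum(1 for line in text.splitlines() if line.startswith(">"))
    return record_count
```

The combined file can then be given as the single query file of one search, instead of starting one job per sequence.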
26. On the Horizon
- Sun Grid Engine
- For access to Sun Fire V880
- Currently being configured.
- Bbq (Beowulf Batch Queue)
- Will be used on the future Beowulf System
27. SOAP Client
- We have developed a SOAP (Simple Object Access Protocol) client which can be incorporated into Perl scripts run on remote hosts, e.g. at your home institutions.
- Supports calls to the BlastMachine (pb) and the GeneMatcher2 (btk).
- It is still experimental!
28. Backup Scheme
Full backup to tape every 15 days. Daily backups
to dedicated backup fileserver.