Name | extract |
Description | extract is part of the Glimmer package, which finds genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses. The extract program takes a FASTA-format sequence file and a file listing start/stop positions in that sequence (e.g., as produced by the long-orfs program) and outputs the specified subsequences. The first command-line argument is the name of the sequence file, which must be in FASTA format. The second command-line argument is the name of the coordinate file, which must contain a list of position pairs in the sequence file, one per line. Each entry consists of an IDstring, followed by the start position, followed by the stop position, separated by white space.
The coordinate file should contain no other information, so if you are using the output of glimmer or long-orfs, you will have to remove the header lines first. The output of the program goes to standard output and has one line for each line in the coordinate file. Each line contains the IDstring, followed by white space, followed by the substring of the sequence file specified by the coordinate pair. Specifically, the substring starts at the first position of the pair and ends at the second position (inclusive). If the first position is greater than the second, the substring is read from the opposite strand, i.e., the reverse complement of that region is output. Start/stop pairs that "wrap around" the end of the genome are allowed. The program also accepts two optional command-line arguments, which are described in the Glimmer readme (see Remote Documentation).
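The coordinate semantics described above can be illustrated with a minimal Python sketch. This is not the extract source code: the function names `extract_region` and `extract_all` are hypothetical, positions are assumed to be 1-based, each coordinate line is assumed to hold an ID string followed by the two positions, and wrap-around pairs are deliberately not handled.

```python
# Complement table for DNA bases; any other character (e.g. N) is left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_region(sequence, start, stop):
    """Return the substring from start to stop (1-based, inclusive).

    If start > stop, the region is read from the opposite strand,
    i.e. the reverse complement is returned.
    Assumption: wrap-around pairs are not supported in this sketch.
    """
    if start <= stop:
        return sequence[start - 1:stop]
    forward = sequence[stop - 1:start]
    return forward.translate(_COMPLEMENT)[::-1]


def extract_all(sequence, coord_lines):
    """Yield 'IDstring substring' for each 'IDstring start stop' line."""
    for line in coord_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines (e.g. leftover headers)
        id_string, start, stop = fields[0], int(fields[1]), int(fields[2])
        yield f"{id_string} {extract_region(sequence, start, stop)}"


if __name__ == "__main__":
    genome = "ACGTTGCA"
    for out_line in extract_all(genome, ["orf1 2 5", "orf2 5 2"]):
        print(out_line)
```

In this sketch, a pair such as 5 2 selects the same bases as 2 5 but returns them reverse-complemented, which mirrors the rule for pairs whose first position is greater than the second.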
References: |
Homepage | http://www.tigr.org/software/glimmer/ |
Remote Documentation | http://www.tigr.org/software/glimmer/glimmer.readme |