With MSQT/Admin you can upload and organize your datasets and compile SNPs using
the module 'Snipe4SNPs'. If you are an OS X user, please also consult the 'Note to OSX users'.
The Dataset Manager provides an overview table of all available datasets you have uploaded
previously with the 'Data Uploader'.
To make a dataset the default dataset, click the 'set default' link in the last column.
The default dataset is the first dataset shown in the dataset selection boxes in MSQT/SBE, MSQT/ADF and MSQT/SNIPED!.
To delete a dataset, click the 'delete' link; you will be asked to confirm the deletion.
Use the 'Data Uploader' to add your datasets into MSQT. Fill out the form:
Name of dataset: Choose an identifier for your dataset. Use only letters, digits and underscores, e.g. my_set01
Structure of data: Choose either 'Position name' or 'Chromosome position'.
Chromosomes: If your dataset is in 'Chromosome position' structure, enter the number of chromosomes (digits only).
Archive File: The archive containing your data. Please read the section 'MSQT Data Input Format Specification' for further details on the archive format.
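The naming rule for 'Name of dataset' can be checked locally before you fill out the form. The following is a minimal sketch in Python, assuming that only ASCII letters, digits and underscores are accepted; the function name is_valid_dataset_name is hypothetical and not part of MSQT.

```python
import re

# Assumption: only ASCII letters, digits and underscores are allowed.
# MSQT's own check may be stricter or looser than this pattern.
_DATASET_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name consists only of letters, digits and underscores."""
    return _DATASET_NAME.fullmatch(name) is not None
```

For example, is_valid_dataset_name("my_set01") is True, while a name containing a dash or a space is rejected.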
Click 'upload'. On the next screen you can inspect the names of all individuals
that MSQT detected in your dataset. If corrections are needed, you will have to return to your original dataset, correct it,
recreate the archive and upload again. Once you have pressed 'continue with data processing', MSQT will create the database schema and load the dataset. In case you need
to correct your data beyond this point you will need to also either use a new name for the dataset or delete the previous dataset using the
On an uploaded dataset you can precompile SNPs for the 'SNIPED!' module with 'Snipe4SNPs'.
This will analyze each SNP in the dataset and load the results into
a new table. This table can then be queried via the SNIPED! frontend.
Choose one dataset and click 'compile SNPs' in the 'Action' column.
On the next screen you will be asked to provide some parameters. Adjusting them
is likely to be useful only for very large datasets. For normal operation
we recommend using the default settings, since subselections can be performed later
with the SNIPED! frontend.
(default: diff_threshold = 1.1, right_neigbh_threshold = 1, left_neigbh_threshold = 1)
For each SNP position the software will determine the two unambiguous alleles (A,G,C,T,-) with allele frequencies closest to 0.5 and
compute the difference (diff) between these two frequencies.
If diff is greater than diff_threshold, this SNP will be skipped and not analyzed.
Because diff cannot exceed 1.0, setting diff_threshold to 1.0 or above disables this criterion.
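The diff criterion can be illustrated with a short Python sketch. It is not MSQT's implementation: the function names are hypothetical, and it assumes that allele frequencies are computed relative to the number of unambiguous calls (A,G,C,T,-) in one alignment column.

```python
from collections import Counter

UNAMBIGUOUS = "AGCT-"


def allele_diff(column: str):
    """Return diff for one alignment column, or None if it has fewer than two alleles.

    Assumption: frequencies use the count of unambiguous calls as denominator;
    MSQT may use a different denominator.
    """
    counts = Counter(base for base in column.upper() if base in UNAMBIGUOUS)
    if len(counts) < 2:
        return None
    total = sum(counts.values())
    # Sort the allele frequencies by their distance to 0.5 and take the two closest.
    freqs = sorted((n / total for n in counts.values()), key=lambda f: abs(f - 0.5))
    return abs(freqs[0] - freqs[1])


def passes_diff_threshold(column: str, diff_threshold: float = 1.1) -> bool:
    """Return True if the column is a SNP that is not skipped by diff_threshold."""
    diff = allele_diff(column)
    # Columns with fewer than two alleles are treated as "not a SNP" here.
    return diff is not None and diff <= diff_threshold
```

For a column such as "AAAG" the two frequencies are 0.75 and 0.25, so diff is 0.5; the SNP passes with the default threshold and is skipped with diff_threshold = 0.4.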
Threshold for the right neighborhood length.
This is the minimal count of bases from the current SNP to the next SNP to the right.
Threshold for the left neighborhood length.
The same as the right_neighborhood_threshold, but to the left.
Setting both thresholds equal to 1 will skip indels, which is probably desired.
Keep in mind that MSQT treats any position within an indel as a SNP.
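The effect of the two neighborhood thresholds can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and it assumes that the "count of bases" is the number of positions strictly between two neighboring SNPs, so that directly adjacent SNPs have a count of 0.

```python
def filter_by_neighborhood(snp_positions, left_threshold=1, right_threshold=1):
    """Keep SNPs that have enough non-SNP bases on both sides.

    Assumptions: the count of bases is the number of positions strictly
    between two neighboring SNPs; a SNP without a neighbor on one side
    passes the test for that side.
    """
    positions = sorted(set(snp_positions))
    last = len(positions) - 1
    kept = []
    for i, pos in enumerate(positions):
        left_ok = i == 0 or pos - positions[i - 1] - 1 >= left_threshold
        right_ok = i == last or positions[i + 1] - pos - 1 >= right_threshold
        if left_ok and right_ok:
            kept.append(pos)
    return kept
```

With both thresholds at 1, the call filter_by_neighborhood([10, 20, 21, 22, 40]) returns [10, 40]: the three adjacent positions 20 to 22, as they would occur within an indel, are skipped, while the isolated SNPs are kept.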
After starting 'Snipe4SNPs', a popup window, the 'Snipe4SNPs Processing Monitor', will appear.
Do not close or reload this window until you read the message 'Snipe4SNPs is finished. You can close this window now'!
This module provides some information about the database and web server that the current MSQT installation is using.
MSQT/Admin 'Data Uploader' requires a compressed archive (zip or tar/gz) of a directory
containing multiple alignment FASTA files arranged in a specific directory hierarchy.
Use the 'Chromosome position' structure if you know the base pair position of your fragment in the reference genome.
There has to be one top level directory which contains as many subdirectories
as there are chromosomes.
The subdirectory names must be prefixed with 'chromosome_' and consecutively numbered starting with 1.
The number needs to be an integer; please also assign numbers to the sex chromosomes.
Each filename must be an integer and must correspond to the position of the first
base of the aligned sequences in the reference genome.
Example for my_dogs:
my_dogs
|-- chromosome_1
| |-- 10095031
| |-- 112445
| |-- 197433
| |-- 2079956
| |-- 29215
| |-- 5757110
| |-- 7672503
| `-- 9343234
|-- chromosome_2
| |-- 1043454
| |-- 2037121
| |-- 208721
| |-- 347341
| |-- 419364
| |-- 4796716
| |-- 5020871
| `-- 9256049
|-- chromosome_3
| |-- 1073635
| |-- 1901310
| |-- 2072648
| |-- 3176691
| |-- 4056767
| |-- 70672
| |-- 8041706
| |-- 9279010
| `-- 964879
|-- chromosome_4
| |-- 1055093
| |-- 142440
| |-- 3006871
| |-- 48286
| |-- 5077409
| |-- 7077771
| |-- 8078388
Use the 'Position name' structure if you work in an organism without a reference genome.
In this case there are no chromosomes and hence no subdirectories.
One top level directory contains all sequence alignment files. The filenames will be used
to compose the unique SNP identifiers.
Example for my_cats:
Remark: the dogs and cats datasets are just examples and are
not related to the animals with the same name.
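Before creating the archive, the rules for the 'Chromosome position' structure can be checked with a small script. The following Python sketch is not part of MSQT; the function name is hypothetical, and skipping hidden files as well as ignoring files in the top level directory are assumptions made for this illustration.

```python
import re
from pathlib import Path


def check_chromosome_layout(top: Path) -> list:
    """Return a list of problems found in a 'Chromosome position' directory."""
    problems = []
    numbers = []
    for sub in sorted(p for p in top.iterdir() if p.is_dir()):
        match = re.fullmatch(r"chromosome_([0-9]+)", sub.name)
        if match is None:
            problems.append(f"unexpected directory name: {sub.name}")
            continue
        numbers.append(int(match.group(1)))
        for entry in sorted(sub.iterdir()):
            # Assumption: hidden files such as .DS_Store are not reported here.
            if entry.name.startswith("."):
                continue
            if re.fullmatch(r"[0-9]+", entry.name) is None:
                problems.append(f"file name is not an integer: {sub.name}/{entry.name}")
    # The subdirectories must be numbered 1, 2, ..., n without gaps.
    if sorted(numbers) != list(range(1, len(numbers) + 1)):
        problems.append("chromosome directories are not numbered consecutively from 1")
    return problems
```

An empty list means that no problem was detected by these checks; it does not guarantee that the upload will succeed.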
Each file has to contain one multiple alignment from one locus in FASTA format with Unix
line breaks. One sequence in each file has to be defined as the 'target' sequence.
The sequences must be aligned and must have exactly the same length; please pad shorter sequences with 'N'.
The filenames must not start with a dot ('.') and must not contain dashes ('-').
Each FASTA file should contain the same set of individuals, or at least a subset of it. For files in which
individuals are missing, a sequence containing only Ns is added automatically for each missing individual during
the data upload process.
Example Fasta file:
No multi-line or interleaved sequences, no comments, no additional newlines are allowed.
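The file rules above can be checked before uploading. The following Python sketch is an illustration only and not the check MSQT performs: the function name is hypothetical, a single trailing newline is assumed to be acceptable, and the 'target' sequence is not examined.

```python
def check_alignment_fasta(data: bytes) -> list:
    """Return a list of problems found in the raw bytes of one alignment file."""
    problems = []
    if b"\r" in data:
        problems.append("non-Unix line breaks found")
    text = data.decode("ascii", errors="replace")
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # assumption: one trailing newline is tolerated
    if len(lines) % 2 != 0:
        problems.append("file does not consist of header/sequence line pairs")
    lengths = set()
    for i, line in enumerate(lines):
        expect_header = i % 2 == 0
        if line == "":
            problems.append(f"line {i + 1}: empty line")
        elif expect_header != line.startswith(">"):
            # Catches multi-line sequences, comment lines and missing sequences.
            kind = "header" if expect_header else "sequence"
            problems.append(f"line {i + 1}: expected a {kind} line")
        elif not expect_header:
            lengths.add(len(line))
    if len(lengths) > 1:
        problems.append("sequences differ in length")
    return problems
```

Reading the file in binary mode, e.g. check_alignment_fasta(open(path, "rb").read()), keeps the original line breaks visible to the check.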
You will need to use an editor that is capable of saving text files with Unix line breaks. We recommend
TextWrangler (from Bare Bones Software). Use -> File -> "save as ...", click on "Options" and choose "Line breaks: Unix".
Do NOT use the right-mouse-click way of compressing your directory (Create Archive of "..."),
because this will create additional directories and files within your archive which will interfere with
the data upload; please open a terminal window (Applications -> Utilities -> Terminal), change into the parent
directory of the directory to be compressed and use tar.
On OS X, tar will not create additional directories. It will, however, also create additional files, but
the data uploader will ignore those; it also ignores the .DS_Store files.
tar -cvzf directoryname.tar.gz directoryname
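As an alternative to the tar command, the archive can also be created with Python's standard tarfile module. This is a sketch only, not a documented MSQT workflow: the function name is hypothetical, and leaving out all dot files (such as .DS_Store) is a choice made for this illustration.

```python
import tarfile
from pathlib import Path


def make_archive(directory: str) -> str:
    """Create <directory>.tar.gz next to the directory and return its path."""
    src = Path(directory).resolve()
    out = src.parent / (src.name + ".tar.gz")

    def skip_hidden(info: tarfile.TarInfo):
        # Returning None excludes the entry; this drops .DS_Store and other dot files.
        return None if Path(info.name).name.startswith(".") else info

    with tarfile.open(str(out), "w:gz") as tar:
        # arcname keeps the directory itself as the single top level entry.
        tar.add(str(src), arcname=src.name, filter=skip_hidden)
    return str(out)
```

Calling make_archive("directoryname") from the parent directory writes directoryname.tar.gz with directoryname as the top level directory inside the archive.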
For Credits and Copyright notices please see about.html