Protein Sequence
representation.scientific.protein_sequence
Amino acid sequence in single-letter IUPAC notation (20 standard amino acids plus ambiguity codes).
Domain: representation › scientific
Casts to: VARCHAR
Scope: Universal
Try it
CLI
$ finetype infer -i "MKVLLIVGS"
→ representation.scientific.protein_sequence
DuckDB
Detect
SELECT finetype('MKVLLIVGS');
-- → 'representation.scientific.protein_sequence'
Cast expression
UPPER(CAST({col} AS VARCHAR))
Safe cast pipeline
-- Normalise and cast in one step
SELECT TRY_CAST(finetype_cast(my_column) AS VARCHAR) AS clean_value
FROM my_table
WHERE finetype(my_column) = 'representation.scientific.protein_sequence';
Struct Expansion
length: LENGTH({col})
molecular_weight_estimate: LENGTH({col}) * 110
JSON Schema
finetype schema representation.scientific.protein_sequence
{
  "$id": "https://noon.sh/schemas/representation.scientific.protein_sequence",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Amino acid sequence in single-letter IUPAC notation (20 standard amino acids plus ambiguity codes).",
  "examples": [
    "MKVLLIVGS",
    "ACDEFGHIKLMNPQRSTVWYFL",
    "MPKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVG"
  ],
  "pattern": "^[ACDEFGHIKLMNPQRSTVWXY*]+$",
  "title": "Protein Sequence",
  "type": "string"
}
Examples
MKVLLIVGS
ACDEFGHIKLMNPQRSTVWYFL
MPKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVG
Also known as
protein
peptide
amino_acid_sequence
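Python sketch
The schema pattern, the cast expression, and the struct expansion fields on this page can be reproduced outside DuckDB for a quick sanity check. The following is a minimal sketch, not finetype's own implementation: the function names matches_schema, normalise, and expand are hypothetical, and treating the factor 110 as an approximate average residue mass in daltons is an assumption based on the field name molecular_weight_estimate.

```python
import re

# Pattern copied verbatim from the JSON Schema on this page.
PROTEIN_PATTERN = re.compile(r"^[ACDEFGHIKLMNPQRSTVWXY*]+$")

# Factor used by the molecular_weight_estimate field (LENGTH({col}) * 110).
# Assumption: it stands for an approximate average residue mass in daltons.
AVERAGE_RESIDUE_MASS = 110


def matches_schema(value: str) -> bool:
    """Return True if the value matches the schema pattern exactly as written."""
    return PROTEIN_PATTERN.fullmatch(value) is not None


def normalise(value: str) -> str:
    """Mirror the cast expression UPPER(CAST({col} AS VARCHAR))."""
    return str(value).upper()


def expand(value: str) -> dict:
    """Compute the two struct expansion fields for a normalised sequence."""
    sequence = normalise(value)
    if not matches_schema(sequence):
        raise ValueError(f"not a protein sequence under the schema pattern: {value!r}")
    return {
        "length": len(sequence),
        "molecular_weight_estimate": len(sequence) * AVERAGE_RESIDUE_MASS,
    }


if __name__ == "__main__":
    for example in ("MKVLLIVGS", "ACDEFGHIKLMNPQRSTVWYFL"):
        print(example, expand(example))
```

Under the pattern as shown, only X and * are accepted beyond the 20 standard letters, so a string containing B or Z fails matches_schema. Lower-case input fails the raw pattern check but passes once normalise has upper-cased it, which is why expand normalises before validating.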