disease.database.dynamodb
Provide DynamoDB client.
- class disease.database.dynamodb.DynamoDbDatabase(db_url=None, **db_args)
Disease Normalizer database client for DynamoDB.
- __init__(db_url=None, **db_args)
Initialize Database class.
- Parameters:
db_url (str) – URL endpoint for DynamoDB source
- Keyword Arguments:
region_name: AWS region (defaults to “us-east-2”)
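As a rough illustration of how these constructor arguments could be turned into client settings, the sketch below collects them into a keyword dictionary. `build_client_kwargs` and `DEFAULT_REGION` are hypothetical names introduced here; the mapping of `db_url` onto boto3's `endpoint_url` argument is an assumption, and the actual constructor may resolve its settings differently.

```python
from typing import Any, Dict, Optional

# Default region named in the keyword-argument description above.
DEFAULT_REGION = "us-east-2"


def build_client_kwargs(db_url: Optional[str] = None, **db_args: Any) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for a boto3 DynamoDB client."""
    kwargs: Dict[str, Any] = {
        "region_name": db_args.get("region_name", DEFAULT_REGION),
    }
    if db_url is not None:
        # boto3 accepts a custom endpoint through ``endpoint_url``;
        # passing db_url this way is an assumption of this sketch.
        kwargs["endpoint_url"] = db_url
    return kwargs


# A local endpoint keeps the default region; an explicit region overrides it.
local = build_client_kwargs("http://localhost:8000")
remote = build_client_kwargs(region_name="us-west-1")
print(local)
print(remote)
```

The resulting dictionary could then be unpacked into a call such as `boto3.resource("dynamodb", **kwargs)`.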
- add_merged_record(record)
Add merged record to database.
- Parameters:
record (Dict) – merged record to add
- Return type:
None
- add_record(record, src_name)
Add new record to database.
- Parameters:
record (Dict) – record to upload
src_name (SourceName) – name of source for record
- Return type:
None
- add_source_metadata(src_name, meta)
Add new source metadata entry.
- Parameters:
src_name (SourceName) – name of source
meta (SourceMeta) – known source attributes
- Raises:
DatabaseWriteException – if write fails
- Return type:
None
- check_schema_initialized()
Check if database schema is properly initialized.
- Return type:
bool
- Returns:
True if DB appears to be fully initialized, False otherwise
- check_tables_populated()
Perform rudimentary checks to see if tables are populated.
Emphasis is on rudimentary – if some fiendish element has deleted half of the disease aliases, this method won’t pick it up. It just wants to see if a few critical tables have at least a small number of records.
- Return type:
bool
- Returns:
True if queries successful, false if DB appears empty
- close_connection()
Perform any manual connection closure procedures if necessary.
- Return type:
None
- complete_write_transaction()
Conclude transaction or batch writing if relevant.
- Return type:
None
- delete_normalized_concepts()
Remove merged records from the database. Use when performing a new update of normalized data.
- Raises:
DatabaseReadException – if DB client requires separate read calls and encounters a failure in the process
DatabaseWriteException – if deletion call fails
- Return type:
None
- delete_source(src_name)
Delete all data for a source. Use when updating source data.
- Parameters:
src_name (SourceName) – name of source to delete
- Raises:
DatabaseReadException – if DB client requires separate read calls and encounters a failure in the process
DatabaseWriteException – if deletion call fails
- Return type:
None
- drop_db()
Delete all tables from database. Requires manual confirmation.
- Raises:
DatabaseWriteException – if called in a protected setting with confirmation silenced.
- Return type:
None
- export_db(export_location)
Dump DB to specified location. Not available for DynamoDB database backend.
- Parameters:
export_location (Path) – path to save DB dump at
- Return type:
None
- get_all_concept_ids(source=None)
Retrieve concept IDs for use in generating normalized records.
- Parameters:
source (Optional[SourceName]) – optionally, just get all IDs for a specific source
- Return type:
Set[str]
- Returns:
Set of concept IDs as strings.
- get_all_records(record_type)
Retrieve all source or normalized records. Either return all source records, or all records that qualify as “normalized” (i.e., merged groups + source records that are otherwise ungrouped). For example,
>>> from disease.database import create_db
>>> from disease.schemas import RecordType
>>> db = create_db()
>>> for record in db.get_all_records(RecordType.MERGER):
...     pass  # do something
- Parameters:
record_type (RecordType) – type of result to return
- Return type:
Generator[Dict, None, None]
- Returns:
Generator that lazily provides records as they are retrieved
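To illustrate what "lazily provides records" can mean for a paginated backend, here is a minimal sketch that yields items page by page and only requests the next page once the current one is consumed. The `fake_scan` and `iter_records` names and the in-memory data are hypothetical stand-ins; only the `Items` and `LastEvaluatedKey` response keys mirror the shape of a DynamoDB scan response, and the library's own implementation may differ.

```python
from typing import Any, Callable, Dict, Generator, List, Optional

Page = Dict[str, Any]


def fake_scan(items: List[Dict], page_size: int) -> Callable[[Optional[int]], Page]:
    """Build a hypothetical paged scan over an in-memory list."""

    def scan(start: Optional[int] = None) -> Page:
        begin = start or 0
        end = begin + page_size
        page: Page = {"Items": items[begin:end]}
        if end < len(items):
            # Signal that more pages remain, as a paginated scan would.
            page["LastEvaluatedKey"] = end
        return page

    return scan


def iter_records(scan: Callable[[Optional[int]], Page]) -> Generator[Dict, None, None]:
    """Yield records page by page, fetching the next page only when needed."""
    start: Optional[int] = None
    while True:
        page = scan(start)
        yield from page["Items"]
        start = page.get("LastEvaluatedKey")
        if start is None:
            break


# Placeholder records; the concept IDs are made up for illustration.
records = [{"concept_id": f"example:{i}"} for i in range(5)]
collected = list(iter_records(fake_scan(records, page_size=2)))
print(len(collected))
```

Because the function is a generator, a caller that stops iterating early never triggers the remaining page requests.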
- get_record_by_id(concept_id, case_sensitive=True, merge=False)
Fetch record corresponding to provided concept ID.
- Parameters:
concept_id (str) – concept ID for disease record
case_sensitive (bool) – if true, performs exact lookup, which is more efficient. Otherwise, performs filter operation, which doesn’t require correct casing.
merge (bool) – if true, look for merged record; look for identity record otherwise.
- Return type:
Optional[Dict]
- Returns:
complete record, if match is found; None otherwise
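The trade-off behind `case_sensitive` can be sketched with an in-memory stand-in: an exact lookup goes straight to one key, while a case-insensitive match has to examine entries until it finds one whose lowercased key agrees. `TABLE` and `get_record` are hypothetical, the concept ID is invented, and nothing here reflects the real table schema.

```python
from typing import Dict, Optional

# Hypothetical in-memory table keyed by concept ID (not the real schema).
TABLE: Dict[str, Dict] = {
    "example:ABC123": {"concept_id": "example:ABC123", "label": "placeholder record"},
}


def get_record(concept_id: str, case_sensitive: bool = True) -> Optional[Dict]:
    """Return the matching record, or None if there is no match."""
    if case_sensitive:
        # Exact lookup: a single key access.
        return TABLE.get(concept_id)
    # Filter: compare lowercased keys, examining entries one at a time.
    wanted = concept_id.lower()
    for key, record in TABLE.items():
        if key.lower() == wanted:
            return record
    return None


print(get_record("EXAMPLE:abc123"))
print(get_record("EXAMPLE:abc123", case_sensitive=False))
```

With the default setting, a query whose casing differs from the stored ID finds nothing, so callers that cannot guarantee casing would pass `case_sensitive=False` and accept the slower path.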
- get_refs_by_type(search_term, ref_type)
Retrieve concept IDs for records matching the user’s query. Other methods are responsible for actually retrieving full records.
- Parameters:
search_term (str) – string to match against
ref_type (RefType) – type of match to look for.
- Return type:
List[str]
- Returns:
list of associated concept IDs. Empty if lookup fails.
- get_source_metadata(src_name)
Get license, versioning, data lookup, and other information for a source.
- Parameters:
src_name (Union[str, SourceName]) – name of the source to get data for
- Return type:
Optional[SourceMeta]
- Returns:
source metadata, if lookup is successful
- list_tables()
Return names of tables in database.
- Return type:
List[str]
- Returns:
Table names in DynamoDB
- load_from_remote(url=None)
Load DB from remote dump. Not available for DynamoDB database backend.
- Parameters:
url (Optional[str]) – remote location to retrieve gzipped dump file from
- Return type:
None
- update_merge_ref(concept_id, merge_ref)
Update the merged record reference of an individual record to a new value.
- Parameters:
concept_id (str) – record to update
merge_ref (Any) – new ref value
- Raises:
DatabaseWriteException – if attempting to update non-existent record
- Return type:
None
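The documented contract of `update_merge_ref` (update an existing record, raise if the record does not exist) can be sketched against an in-memory store. The `DatabaseWriteException` class below is a local stand-in sharing the name used in the docs, and `RECORDS` and the `merge_ref` field name are assumptions made for illustration only.

```python
from typing import Any, Dict


class DatabaseWriteException(Exception):
    """Local stand-in for the exception named in the docs above."""


# Hypothetical in-memory store; the "merge_ref" field name is an assumption.
RECORDS: Dict[str, Dict[str, Any]] = {
    "example:1": {"concept_id": "example:1"},
}


def update_merge_ref(concept_id: str, merge_ref: Any) -> None:
    """Set the merge reference on an existing record, or raise if it is missing."""
    record = RECORDS.get(concept_id)
    if record is None:
        raise DatabaseWriteException(f"No record found for {concept_id}")
    record["merge_ref"] = merge_ref


# Updating a known record succeeds.
update_merge_ref("example:1", "example:merged")

# Updating an unknown record raises instead of silently creating it.
try:
    update_merge_ref("example:404", "example:merged")
    failed = False
except DatabaseWriteException:
    failed = True
print(failed)
```

Raising on a missing record keeps the update from creating a partial record that holds only a merge reference.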