Overview
The BufferMetadataCache class can be used to organize large numbers of buffer files.
For example, you might have a project generated by the Analyzer4D software and want to search through all of its buffer files
to find every buffer with a certain compression.
You could open each file and check whether its compression matches your requirement, but that takes a lot of time.
Instead, the cache creates a database that stores all the metadata available in the qass.tools.analyzer.buffer_parser.Buffer class.
You can then query the cache very fast and programmatically in Python!
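To make the idea concrete, here is a minimal sketch of the caching principle using only Python's built-in sqlite3 module. It is not the library's implementation: the table layout, the reduced set of columns, and the sample rows are assumptions made purely for illustration.

```python
import sqlite3

# Stand-in for the cache database; the real cache manages its own schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE buffer_metadata ("
    "filepath TEXT PRIMARY KEY, process INTEGER, channel INTEGER, "
    "compression_frq INTEGER)"
)

# Hypothetical header values that would otherwise require opening each file.
rows = [
    ("my/directory/examplep2c0b04", 2, 0, 8),
    ("my/directory/examplep1c0b04", 1, 0, 8),
    ("my/directory/examplep3c0b04", 3, 0, 4),
]
conn.executemany("INSERT INTO buffer_metadata VALUES (?, ?, ?, ?)", rows)

# A single query replaces opening and parsing every file.
matching_files = [
    filepath
    for (filepath,) in conn.execute(
        "SELECT filepath FROM buffer_metadata "
        "WHERE compression_frq = ? ORDER BY process",
        (8,),
    )
]
print(matching_files)
```

The file headers are read once when the table is filled; every later search only touches the database.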
Example
In this example we create an instance of the cache and synchronize it with the directory my/directory.
We then pass a sqlalchemy select statement on BufferMetadata to the cache to query for all buffers with compression_frq == 8.
You can use all properties that are in BufferMetadata.properties.
The BufferMetadataCache.get_matching_buffers() method returns a list of Buffer objects that are in this case sorted by their process number (as specified by order_by).
from qass.tools.analyzer.buffer_metadata_cache import BufferMetadataCache as BMC, BufferMetadata as BM, select

cache = BMC()
cache.synchronize_directory("my/directory")

results = cache.get_matching_buffers(query=select(BM).filter(BM.compression_frq == 8).order_by(BM.process))
Example (Deprecated)
from qass.tools.analyzer.buffer_metadata_cache import BufferMetadataCache as BMC, BufferMetadata as BM
from qass.tools.analyzer.buffer_parser import Buffer

cache = BMC(BMC.create_session(), Buffer)
cache.synchronize_directory("my/directory")

buffer_metadata = BM(compression_frq=8)
results = cache.get_matching_buffers(buffer_metadata, sort_key=lambda bm: bm.process)
BufferMetadataCache
- class qass.tools.analyzer.buffer_metadata_cache.BufferMetadataCache(session=None, Buffer_cls=<class 'qass.tools.analyzer.buffer_parser.Buffer'>, db_url='sqlite:///:memory:')[source]
This class acts as a Cache for Buffer Metadata. It uses a database session with a buffer_metadata table to map metadata to files on the disk. The cache can be queried a lot faster than manually opening a lot of buffer files.
- class BufferMetadata(**kwargs)
This class acts as a template for buffer files. Its properties represent all available metadata of a buffer file. The class is used internally as a database model. It can also be instantiated as a template for a buffer file: populate the desired properties and pass the object to the cache, which in turn creates a query based on this object.
- static buffer_to_metadata(buffer)
Converts a Buffer object to a BufferMetadata database object by copying all the @properties from the Buffer object and putting them in the BufferMetadata object.
- Parameters:
buffer (buffer_parser.Buffer) – Buffer object
- add_files_to_cache(files, verbose=0, batch_size=1000, machine_id=None)[source]
Add buffer files to the cache by providing the complete filepaths
- Parameters:
files (list, tuple of str) – complete filepaths that are added to the cache. The filepath is used with the Buffer class to open a buffer and extract the header information.
verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar
- static create_session(engine=None, db_url='sqlite:///:memory:')[source]
Create a session and initialize the schema for the BufferMetadataCache. If an engine is provided the schema will be expanded by the buffer_metadata table.
- Parameters:
engine – An instance of a sqlalchemy engine. Typically sqlalchemy.create_engine()
db_url (str) – The string used to create the engine. This can be a psycopg2, mysql or sqlite3 string. The default will create the database in main memory.
- Returns:
A sqlalchemy session instance
- Return type:
sqlalchemy.orm.Session
- get_buffer_metadata_query(buffer_metadata)[source]
Converts a BufferMetadata object to a complete query. Every property of the object will be converted into SQL and returned as a sqlalchemy.orm.query.FromStatement object.
- Parameters:
buffer_metadata (BufferMetadata) – The template BufferMetadata object.
- Returns:
The sqlalchemy query object
- Return type:
sqlalchemy.orm.query.FromStatement
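The idea of turning a template object into a query can be sketched without sqlalchemy. The snippet below is an illustration only, not the library's code: TemplateMetadata and template_to_where_clause are hypothetical names, and the sketch assumes that only the properties actually set on the template become equality conditions.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TemplateMetadata:
    """Hypothetical stand-in for a BufferMetadata template."""

    process: Optional[int] = None
    channel: Optional[int] = None
    compression_frq: Optional[int] = None


def template_to_where_clause(template):
    """Build a parameterized WHERE clause from the properties that are set."""
    conditions = {k: v for k, v in asdict(template).items() if v is not None}
    if not conditions:
        return "", []
    clause = " AND ".join(f"{name} = ?" for name in conditions)
    return f"WHERE {clause}", list(conditions.values())


where, params = template_to_where_clause(
    TemplateMetadata(channel=1, compression_frq=8)
)
print(where, params)
```

Unset properties are skipped, so an empty template places no restriction on the result.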
- get_matching_buffers(buffer_metadata: BufferMetadata | None = None, filter_function: Callable | None = None, sort_key: Callable | None = None, query: Select | None = None)[source]
Calls get_matching_files and converts the result to Buffer objects
- Returns:
List of Buffer objects
- Return type:
list
- get_matching_files(buffer_metadata: BufferMetadata | None = None, filter_function: Callable | None = None, sort_key: Callable | None = None, query: Select | None = None)[source]
Query the cache for all files matching the properties that are selected by the query object. The use of buffer_metadata, filter_function and sort_key is deprecated and will be removed in two minor versions. Use the sqlalchemy query parameter instead.
BufferMetadataCache.get_matching_files(
    select(BM).filter(BM.channel==1, BM.compression_frq==4, BM.process > 100)
)
# Returns all buffer filepaths with channel = 1, a frequency compression of 4,
# processes above 100 sorted by the process number
- Parameters:
buffer_metadata (BufferMetadata) – A metadata object acting as the filter. Only buffers matching the attributes of the provided BufferMetadata object are selected. This operation is done on the database
filter_function (function) – A function taking a BufferMetadata object as a parameter returning a boolean. This means a conjunction of BufferMetadata attributes.
sort_key (function) – A function taking a BufferMetadata object as a parameter returning an attribute the objects can be sorted with
query (Select) – A sqlalchemy select statement specifying the properties of the BufferMetadata objects
- Returns:
A list with the paths to the buffer files that match the buffer_metadata
- Return type:
list[str]
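The roles of the deprecated filter_function and sort_key parameters can be pictured with plain Python. This sketch uses a hypothetical Meta stand-in rather than the real BufferMetadata class, and it assumes that both callables are applied to the metadata objects on the Python side; the file names are invented.

```python
from dataclasses import dataclass


@dataclass
class Meta:
    """Hypothetical stand-in for a BufferMetadata entry."""

    filepath: str
    process: int
    channel: int


entries = [
    Meta("examplep300c1b04", 300, 1),
    Meta("examplep50c1b04", 50, 1),
    Meta("examplep120c1b04", 120, 1),
]


def filter_function(m):
    # Takes one metadata object and returns a boolean.
    return m.channel == 1 and m.process > 100


def sort_key(m):
    # Takes one metadata object and returns the value to sort by.
    return m.process


selected = sorted(filter(filter_function, entries), key=sort_key)
filepaths = [m.filepath for m in selected]
print(filepaths)
```

With the query parameter, the same conditions and ordering are written into the select statement instead of being passed as Python callables.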
- get_matching_metadata(buffer_metadata: BufferMetadata | None = None, filter_function: Callable | None = None, sort_key: Callable | None = None, query: Select | None = None)[source]
Query the cache for all BufferMetadata database entries matching the given query.
- Parameters:
query (Select) – A sqlalchemy select statement specifying the properties of the BufferMetadata objects
- Returns:
A list with the paths to the buffer files that match the buffer_metadata
- Return type:
list[str]
- get_non_synchronized_files(files, machine_id)[source]
Calculate the difference between the set of files and the set of synchronized files.
- Parameters:
files (str) – filenames
- Returns:
The set of files that are not synchronized, and the database entries whose files are no longer present
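The two results described above are plain set differences. The following sketch shows the principle on hypothetical file names; split_sync_state is an illustrative helper, not a function of the library, and it ignores the machine_id parameter.

```python
def split_sync_state(files_on_disk, synchronized_files):
    """Return (files missing from the cache, cache entries whose file is gone)."""
    on_disk = set(files_on_disk)
    in_cache = set(synchronized_files)
    return on_disk - in_cache, in_cache - on_disk


unsynchronized, stale = split_sync_state(
    ["a/examplep1c0b04", "a/examplep2c0b04"],  # found on disk
    ["a/examplep2c0b04", "a/examplep3c0b04"],  # already in the cache
)
print(unsynchronized, stale)
```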
- remove_files_from_cache(files, verbose=0)[source]
Remove synchronized files from the cache
- Parameters:
files (list, tuple of str) – complete filepaths that are present in the cache
verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar
- static split_filepath(filepath)[source]
Splits a filepath to folder and filename and returns them as a tuple
- Parameters:
filepath (str) – The filepath to split
- Returns:
A tuple containing (directory_path, filename) as strings
- Return type:
tuple(str)
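A comparable split can be sketched with the standard library. This assumes the method behaves like os.path.split, which the documentation above does not confirm; split_filepath_sketch is a hypothetical name.

```python
import os


def split_filepath_sketch(filepath):
    """Split a filepath into (directory_path, filename)."""
    directory_path, filename = os.path.split(filepath)
    return directory_path, filename


parts = split_filepath_sketch("my/directory/examplep1c0b04")
print(parts)
```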
- synchronize_directory(*paths, sync_subdirectories=True, regex_pattern='^.*[p][0-9]*[c][0-9]{1}[b][0-9]{2}', verbose=1, delete_stale_entries=False, machine_id=None)[source]
Synchronize the buffer files in the given paths that match the regex pattern with the database.
- Parameters:
paths (str) – The absolute paths to the directories
sync_subdirectories (bool, optional) – When True synchronize all of the subdirectories recursively, defaults to True
regex_pattern (string, optional) – The regex pattern validating the buffer naming format
verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar
machine_id (string, optional) – An optional identifier for a certain machine to enable synchronization of different platforms
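To see which file names the default regex_pattern accepts, it can be tried out with Python's re module. The pattern is copied from the signature above; the candidate names are invented, and testing with re.match is an assumption about how the pattern is applied.

```python
import re

# Default pattern copied from the synchronize_directory signature.
DEFAULT_PATTERN = r"^.*[p][0-9]*[c][0-9]{1}[b][0-9]{2}"

# Hypothetical file names used only for illustration.
candidates = ["examplep12c0b04", "notes.txt", "examplep5c1b1"]

# Keep the names in which p<digits>c<digit>b<two digits> can be found.
matches = [name for name in candidates if re.match(DEFAULT_PATTERN, name)]
print(matches)
```

The last candidate is rejected because the pattern requires two digits after the b.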
BufferMetadata
- class qass.tools.analyzer.buffer_metadata_cache.BufferMetadata(**kwargs)[source]
This class acts as a template for buffer files. Its properties represent all available metadata of a buffer file. The class is used internally as a database model. It can also be instantiated as a template for a buffer file: populate the desired properties and pass the object to the cache, which in turn creates a query based on this object.
- static buffer_to_metadata(buffer)[source]
Converts a Buffer object to a BufferMetadata database object by copying all the @properties from the Buffer object and putting them in the BufferMetadata object.
- Parameters:
buffer (buffer_parser.Buffer) – Buffer object
- qass.tools.analyzer.buffer_metadata_cache.get_declarative_base()[source]
Getter for the declarative Base that is used by the BufferMetadataCache.
- Returns:
declarative base class