Profiling API

Generic SQLAlchemy-backed profiling service.

class sqldbagent.profile.service.SQLAlchemyProfilingService(engine, inspector, settings=None)

Bases: object

Profiling service backed by SQLAlchemy queries.

Parameters:

engine (Engine)
inspector (SQLAlchemyInspectionService)
settings (ProfilingSettings | None, default: None)

__init__(engine, inspector, settings=None)

Initialize the profiling service.

Parameters:

engine (Engine) – SQLAlchemy engine used for profiling queries.
inspector (SQLAlchemyInspectionService) – Inspection service used for normalized metadata and relationships.
settings (ProfilingSettings | None, default: None) – Profiling defaults and limits.

Return type:

None

profile_table(table_name, schema=None, *, sample_size=5, top_value_limit=5)

Build a normalized table profile.

Parameters:

table_name (str) – Table name to profile.
schema (str | None, default: None) – Optional schema name.
sample_size (int, default: 5) – Number of sample rows to include.
top_value_limit (int, default: 5) – Number of top values to include per column.

Returns:

Profile result for the table.

Return type:

TableProfileModel
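The per-column statistics in a table profile (null counts, distinct counts, minimum and maximum, most frequent values) can be gathered with ordinary SQL aggregates. The following is a minimal sketch of that idea, not the library's implementation: it uses the standard-library sqlite3 module instead of SQLAlchemy so it runs on its own, and the helper name profile_column and the "value"/"count" keys are assumptions made for illustration.

```python
import sqlite3


def profile_column(conn, table, column, top_value_limit=5):
    """Collect basic statistics for one column using SQL aggregates."""
    # Identifiers cannot be bound as parameters, so quote them explicitly.
    t = '"' + table.replace('"', '""') + '"'
    c = '"' + column.replace('"', '""') + '"'

    # COUNT(col) and COUNT(DISTINCT col) both ignore NULLs.
    row_count, non_null, distinct, lo, hi = conn.execute(
        f"SELECT COUNT(*), COUNT({c}), COUNT(DISTINCT {c}), MIN({c}), MAX({c}) "
        f"FROM {t}"
    ).fetchone()

    # Most frequent non-null values, ties broken by value for stable output.
    top = conn.execute(
        f"SELECT {c}, COUNT(*) AS n FROM {t} WHERE {c} IS NOT NULL "
        f"GROUP BY {c} ORDER BY n DESC, {c} LIMIT ?",
        (top_value_limit,),
    ).fetchall()

    return {
        "name": column,
        "null_count": row_count - non_null,
        "non_null_count": non_null,
        "unique_value_count": distinct,
        "min_value": lo,
        "max_value": hi,
        # Hypothetical entry shape; the real top_values keys may differ.
        "top_values": [{"value": v, "count": n} for v, n in top],
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("paid",), ("paid",), ("open",), (None,)],
)
status_profile = profile_column(conn, "orders", "status")
```

With the four rows above, the status column has one null, three non-null values, and two distinct values, with "paid" as the most frequent.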

sample_table(table_name, schema=None, *, limit=5)

Return sample rows from a table.

Parameters:

table_name (str) – Table name to sample.
schema (str | None, default: None) – Optional schema name.
limit (int, default: 5) – Maximum number of rows to return.

Returns:

Sample rows.

Return type:

list[dict[str, object | None]]
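The return type list[dict[str, object | None]] means one dictionary per row, keyed by column name, with None standing in for SQL NULL. As a rough illustration of that shape (a standalone sketch using sqlite3, not the service's own code; the helper name sample_rows is hypothetical):

```python
import sqlite3


def sample_rows(conn, table, limit=5):
    """Return up to `limit` rows as column-name -> value dictionaries."""
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
    # cursor.description holds one tuple per column; the name is element 0.
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), (None,), ("c@example.com",)],
)
rows = sample_rows(conn, "users", limit=2)
```

Because the query has no ORDER BY, which rows come back is up to the database; callers should treat the sample as illustrative rather than representative.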

get_unique_values(table_name, column_name, schema=None, *, limit=20)

Return distinct values and counts for one column.

Parameters:

table_name (str) – Table name to inspect.
column_name (str) – Column name whose distinct values should be returned.
schema (str | None, default: None) – Optional schema name.
limit (int, default: 20) – Maximum number of distinct values to return.

Returns:

Distinct-value distribution for the column.

Return type:

ColumnUniqueValuesModel
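A distinct-value distribution with a cap amounts to a GROUP BY query plus a check on whether the cap cut anything off. The sketch below shows one way to do that with sqlite3; it is an assumption about the general approach, not the service's actual query, and the helper name unique_values and the "value"/"count" keys are made up for the example.

```python
import sqlite3


def unique_values(conn, table, column, limit=20):
    """Return distinct non-null values with counts, capped at `limit`."""
    t = '"' + table.replace('"', '""') + '"'
    c = '"' + column.replace('"', '""') + '"'

    # Total number of distinct non-null values, independent of the cap.
    total_distinct = conn.execute(
        f"SELECT COUNT(DISTINCT {c}) FROM {t}"
    ).fetchone()[0]

    rows = conn.execute(
        f"SELECT {c}, COUNT(*) AS n FROM {t} WHERE {c} IS NOT NULL "
        f"GROUP BY {c} ORDER BY n DESC, {c} LIMIT ?",
        (limit,),
    ).fetchall()

    return {
        "unique_value_count": total_distinct,
        "values": [{"value": v, "count": n} for v, n in rows],
        # True when the cap hid at least one distinct value.
        "truncated": total_distinct > len(rows),
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("click",), ("click",), ("view",), ("scroll",), (None,)],
)
capped = unique_values(conn, "events", "kind", limit=2)
full = unique_values(conn, "events", "kind")
```

Here the column has three distinct non-null values, so the call with limit=2 reports truncation while the call with the default limit does not.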

Normalized profiling models.

class sqldbagent.core.models.profile.ColumnProfileModel(**data)

Bases: BaseModel

Normalized column profile.

Variables:

name – Column name.
data_type – Reflected column data type.
null_count – Exact null count when available.
non_null_count – Exact non-null count when available.
null_ratio – Null ratio when row count is available.
unique_value_count – Exact unique count when available.
unique_ratio – Ratio of unique non-null values to total rows when available.
min_value – Best-effort minimum value.
max_value – Best-effort maximum value.
sample_values – Best-effort sample values for the column.
top_values – Most frequent values and counts.
summary – Generated short summary.

Parameters:

data (Any)
name (str)
data_type (str)
null_count (int | None)
non_null_count (int | None)
null_ratio (float | None)
unique_value_count (int | None)
unique_ratio (float | None)
min_value (object | None)
max_value (object | None)
sample_values (list[object])
top_values (list[dict[str, object]])
summary (str | None)

name: str

data_type: str

null_count: int | None

non_null_count: int | None

null_ratio: float | None

unique_value_count: int | None

unique_ratio: float | None

min_value: object | None

max_value: object | None

sample_values: list[object]

top_values: list[dict[str, object]]

summary: str | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
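The two ratio fields follow from the counts. A small sketch of the arithmetic, assuming both ratios use the total row count as the denominator (the field description states this for unique_ratio; for null_ratio it is an assumption) and that they are None when no row count is available:

```python
def column_ratios(row_count, null_count, unique_value_count):
    """Derive (null_ratio, unique_ratio) from raw counts.

    Returns (None, None) when the row count is missing or zero,
    since no meaningful ratio can be formed in that case.
    """
    if not row_count:
        return None, None
    return null_count / row_count, unique_value_count / row_count


# 200 rows, 50 of them NULL, 30 distinct non-null values.
null_ratio, unique_ratio = column_ratios(
    row_count=200, null_count=50, unique_value_count=30
)
empty_ratios = column_ratios(row_count=0, null_count=0, unique_value_count=0)
```

In this example null_ratio is 0.25 and unique_ratio is 0.15.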

class sqldbagent.core.models.profile.ColumnUniqueValuesModel(**data)

Bases: BaseModel

Normalized unique-values payload for one column.

Variables:

database – Optional database name containing the table.
schema_name – Optional schema containing the table.
table_name – Table name containing the column.
column_name – Column name whose values were inspected.
row_count – Exact table row count when available.
null_count – Exact null count for the column when available.
non_null_count – Exact non-null count for the column when available.
unique_value_count – Exact number of distinct non-null values.
values – Distinct values with their frequencies.
truncated – Whether the values list was cut short by the caller-supplied limit.
summary – Generated short summary.

Parameters:

data (Any)
database (str | None)
schema_name (str | None)
table_name (str)
column_name (str)
row_count (int | None)
null_count (int | None)
non_null_count (int | None)
unique_value_count (int | None)
values (list[dict[str, object]])
truncated (bool)
summary (str | None)

database: str | None

schema_name: str | None

table_name: str

column_name: str

row_count: int | None

null_count: int | None

non_null_count: int | None

unique_value_count: int | None

values: list[dict[str, object]]

truncated: bool

summary: str | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sqldbagent.core.models.profile.TableProfileModel(**data)

Bases: BaseModel

Normalized, low-cost table profile.

Variables:

database – Optional database name containing the table.
schema_name – Optional schema name containing the table.
table_name – Table name.
row_count – Row count when available; row_count_exact indicates whether it is exact.
row_count_exact – Whether the row count is exact.
storage_bytes – Best-effort storage bytes when available.
storage_scope – Scope represented by storage_bytes.
storage_source – How storage bytes were obtained.
entity_kind – Heuristic entity classification for the table.
related_tables – Related tables inferred from foreign keys.
relationships – Relationships inferred from foreign keys.
relationship_count – Number of inferred relationships.
columns – Per-column profile summaries.
sample_rows – Sample rows from the table.
summary – Generated short summary.

Parameters:

data (Any)
database (str | None)
schema_name (str | None)
table_name (str)
row_count (int | None)
row_count_exact (bool)
storage_bytes (int | None)
storage_scope (str | None)
storage_source (str | None)
entity_kind (str | None)
related_tables (list[str])
relationships (list[ForeignKeyModel])
relationship_count (int)
columns (list[ColumnProfileModel])
sample_rows (list[dict[str, object | None]])
summary (str | None)

database: str | None

schema_name: str | None

table_name: str

row_count: int | None

row_count_exact: bool

storage_bytes: int | None

storage_scope: str | None

storage_source: str | None

entity_kind: str | None

related_tables: list[str]

relationships: list[ForeignKeyModel]

relationship_count: int

columns: list[ColumnProfileModel]

sample_rows: list[dict[str, object | None]]

summary: str | None

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
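The related_tables and relationship_count fields are described as inferred from foreign keys. As a rough, standalone illustration of that inference, the sketch below reads a table's outgoing foreign keys through SQLite's PRAGMA foreign_key_list. Both the helper name foreign_key_summary and the restriction to outgoing keys are assumptions; the library works through its inspection service and may also consider tables that reference the profiled one.

```python
import sqlite3


def foreign_key_summary(conn, table):
    """Return (related_tables, relationship_count) from outgoing foreign keys."""
    quoted = '"' + table.replace('"', '""') + '"'
    # Each result row is (id, seq, referenced_table, from_col, to_col, ...).
    fks = conn.execute(f"PRAGMA foreign_key_list({quoted})").fetchall()
    related = sorted({fk[2] for fk in fks})
    return related, len(fks)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        product_id INTEGER REFERENCES products(id)
    );
    """
)
related_tables, relationship_count = foreign_key_summary(conn, "orders")
```

For the schema above, orders is related to customers and products through two foreign keys, while customers has no outgoing relationships.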