filoma: Quick interactive examples
This notebook demonstrates key filoma capabilities and includes lightweight checks to confirm that it works in your environment.
It covers: imports and version checks, probing a file and a directory, working with the filoma.DataFrame wrapper, using probe_to_df, a small image probe example, and saving a CSV export.
Note: cells wrap operations in try/except so the notebook still runs if optional dependencies (e.g. polars, numpy, or image backends) are missing.
# Basic environment and import checks
from pathlib import Path

import filoma
from filoma import DataFrame


def check_imports():
    results = {}
    try:
        import filoma

        results["filoma"] = getattr(filoma, "__version__", "unknown")
    except Exception as e:
        results["filoma"] = f"IMPORT ERROR: {e}"
    for pkg in ("polars", "numpy", "PIL"):
        try:
            __import__(pkg if pkg != "PIL" else "PIL.Image")
            results[pkg] = "available"
        except Exception as e:
            results[pkg] = f"missing ({e})"
    # show where we are running the notebook from
    results["cwd"] = str(Path(".").resolve())
    return results


check_imports()
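The check above imports each package to find out whether it is present. A lighter alternative, sketched here with only the standard library, asks importlib whether a module can be located without importing it. The helper name find_optional is hypothetical and not part of filoma.

```python
import importlib.util


def find_optional(names):
    """Map each top-level module name to True if it can be located, else False."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or oddly configured modules
            found[name] = False
    return found


# "json" ships with Python; the other three are the optional packages used here
status = find_optional(["json", "polars", "numpy", "PIL"])
print(status)
```

This only reports whether a module is discoverable. A package that is installed but broken can still fail at import time, so the try/except pattern used in the cells remains the more robust check.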
1) Quick probe: a single file and a directory
Try probing a README or small file, then probe a lightweight sample directory from the repo's tests/ tree.
file_candidate = "../README.md"
dir_candidate = "../tests/"

print("probing file ->", file_candidate)
# Check that the path exists so the fallback message below is reachable
if Path(file_candidate).is_file():
    try:
        file_report = filoma.probe(file_candidate)
        print("file probe result type:", type(file_report))
        try:
            # many filoma dataclasses implement a nice repr or to-dict
            print(file_report)
        except Exception:
            pass
    except Exception as e:
        print("file probe failed:", e)
else:
    print("No small file found to probe in the repository root.")

print("probing directory ->", dir_candidate)
if Path(dir_candidate).is_dir():
    try:
        dir_report = filoma.probe(dir_candidate, max_depth=2, threads=2)
        print("directory probe returned an object of type:", type(dir_report))
        # If it exposes a to_df() method we can inspect a little
        if hasattr(dir_report, "to_df"):
            try:
                dfw = dir_report.to_df()
                print("to_df() -> wrapper type:", type(dfw))
            except Exception as e:
                print("to_df() raised:", e)
    except Exception as e:
        print("directory probe failed:", e)
else:
    print("No small directory found to probe in tests/; adjust the path and re-run.")
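To make the effect of a depth limit concrete, here is a standard-library sketch of a depth-limited directory walk. It is not how filoma implements probe(). It assumes that max_depth counts directory levels below the root, which may differ from filoma's own definition, and the helper name walk_limited is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def walk_limited(root, max_depth):
    """Yield files under root, descending at most max_depth levels below it."""
    root = Path(root)
    for current, dirnames, filenames in os.walk(root):
        depth = len(Path(current).relative_to(root).parts)
        if depth >= max_depth:
            # Clearing dirnames in place stops os.walk from descending further
            dirnames[:] = []
        for name in filenames:
            yield Path(current) / name


# Build a tiny throwaway tree: top.txt, a/mid.txt, a/b/deep.txt
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a" / "b").mkdir(parents=True)
    (base / "top.txt").write_text("x")
    (base / "a" / "mid.txt").write_text("x")
    (base / "a" / "b" / "deep.txt").write_text("x")
    shallow = sorted(p.name for p in walk_limited(base, max_depth=1))
    print(shallow)
```

With max_depth=1 the walk visits the root and its direct subdirectory a, but never enters a/b, so deep.txt is not listed.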
2) Working with filoma.DataFrame wrapper
Construct a filoma.DataFrame from a list of paths and run the convenience enrichers: .add_path_components(), .add_file_stats_cols(), and .add_depth_col().
sample_paths = [p for p in (Path("../README.md"), Path("../pyproject.toml"), Path("../Cargo.toml")) if p.exists()]
if not sample_paths:
    # fallback to a couple of files from tests if present
    sample_paths = [p for p in (Path("../tests/test_basic_dataframe.py"), Path("../tests/test_rust_comprehensive.py")) if p.exists()]
print("sample paths used:", sample_paths)

dfw = DataFrame(sample_paths)
print("Initial wrapper and head:")
print(dfw.head(10))

print("With path components:")
try:
    df_components = dfw.add_path_components()
    print(df_components.head(10))
except Exception as e:
    print("add_path_components failed:", e)

print("With file stats:")
try:
    df_stats = dfw.add_file_stats_cols()
    print(df_stats.head(10))
except Exception as e:
    print("add_file_stats_cols failed:", e)

print("Add depth column relative to repo root:")
try:
    # the sample paths live one level up, so the repo root is ".." here
    df_depth = dfw.add_depth_col(Path(".."))
    print(df_depth.head(10))
except Exception as e:
    print("add_depth_col failed:", e)
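For a sense of the kind of columns these enrichers add, the following standard-library sketch computes comparable fields for a single path. It is an illustration only: the helper describe_path and its field names are assumptions, not filoma's actual column names or implementation.

```python
import tempfile
from pathlib import Path


def describe_path(path, root):
    """Return name parts, basic stat fields, and depth relative to root."""
    path, root = Path(path), Path(root)
    stat = path.stat()
    return {
        "parent": str(path.parent),
        "name": path.name,
        "stem": path.stem,
        "suffix": path.suffix,
        "size_bytes": stat.st_size,
        "modified": stat.st_mtime,
        # Number of path parts between root and the file, counting the file itself
        "depth": len(path.relative_to(root).parts),
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "docs" / "notes.md"
    sample.parent.mkdir()
    sample.write_text("hello")
    row = describe_path(sample, tmp)
    print(row["name"], row["suffix"], row["size_bytes"], row["depth"])
```

Note that relative_to raises ValueError when the file is not located under the given root, which is one reason a depth calculation can fail if the root argument does not match the paths in the table.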
3) Build a DataFrame from a directory using probe_to_df
This uses filoma's convenience function probe_to_df, which builds a Polars DataFrame if polars is installed. We probe the tests/ folder with a shallow depth limit to keep the runtime small.
from filoma import probe_to_df

dir_path = "../tests"
# Check that the directory exists so the skip message is reachable
if not Path(dir_path).is_dir():
    print("No test directory available for probe_to_df; skip this cell.")
else:
    try:
        pl_df = probe_to_df(dir_path, to_pandas=False, enrich=True, max_depth=2, threads=2)
        print("probe_to_df returned a Polars DataFrame with shape:", pl_df.shape)
        # Show a small sample and a group_by_extension summary when available
        try:
            print("Sample rows:")
            print(pl_df.head(5))
        except Exception:
            pass
        try:
            print("Extension counts:")
            # wrap it in a DataFrame wrapper if needed
            from filoma import DataFrame as DFWrap

            wrapper = DFWrap(pl_df)
            print(wrapper.group_by_extension().head(10))
        except Exception as e:
            print("group_by_extension failed:", e)
    except Exception as e:
        print("probe_to_df failed:", e)
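If polars is not installed, a rough equivalent of the extension summary can be sketched with the standard library. This is not filoma's group_by_extension. The helper count_extensions is hypothetical, and lowercasing the suffix is a choice made for this sketch that filoma may not share.

```python
from collections import Counter
from pathlib import Path


def count_extensions(paths):
    """Count files per lowercase suffix; files without one are keyed as ''."""
    return Counter(Path(p).suffix.lower() for p in paths)


# Placeholder file names, chosen only to show mixed case and a missing suffix
counts = count_extensions(["a.py", "b.PY", "README.md", "Makefile"])
for ext, n in counts.most_common():
    print(repr(ext), n)
```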
4) Image probing (in-memory)
Create a small numpy array and pass it to filoma.probe_image to exercise the code path that accepts arrays. This avoids needing image files or heavy dependencies.
try:
    import numpy as np

    arr = np.random.randn(16, 16)
    img_report = filoma.probe_image(arr)
    print("probe_image on numpy array returned type:", type(img_report))
    try:
        print(img_report)
    except Exception:
        pass
except Exception as e:
    print("Skipping image probe; numpy unavailable or probe failed:", e)
5) Save a small CSV export (if polars is available)
This cell attempts to save the probe_to_df result or our small DataFrame example to /tmp/filoma_example.csv. It prints a short verification sample.
out_path = Path("/tmp/filoma_example.csv")
saved = False
try:
    if "pl_df" in globals():
        # Try write via polars if present
        try:
            pl_df.write_csv(str(out_path))
            saved = True
        except Exception:
            pass
    # Fall back to the wrapper from section 2; check the name that is actually used
    if not saved and "df_stats" in globals():
        try:
            df_stats.save_csv(out_path)
            saved = True
        except Exception:
            pass
    if saved:
        print("Saved CSV to", out_path)
        try:
            print("CSV sample:", out_path.read_text().splitlines()[:10])
        except Exception:
            pass
    else:
        print("Could not save CSV; polars or file-writer not available.")
except Exception as e:
    print("Saving CSV failed:", e)
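The cell above hard-codes /tmp, which does not exist on every platform. As a hedged alternative, this standard-library sketch writes a small CSV to the platform's temporary directory and reads it back. The file name and the two rows are placeholder values invented for the example, not output from filoma.

```python
import csv
import tempfile
from pathlib import Path

# tempfile.gettempdir() resolves the platform's temp directory (not always /tmp)
out_file = Path(tempfile.gettempdir()) / "filoma_example_sketch.csv"

# Placeholder rows; real data would come from a probe result
rows = [
    {"path": "README.md", "size_bytes": 1200},
    {"path": "pyproject.toml", "size_bytes": 800},
]

# newline="" lets the csv module control line endings itself
with out_file.open("w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=["path", "size_bytes"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to verify the export; all values come back as strings
with out_file.open(newline="", encoding="utf-8") as fh:
    loaded = list(csv.DictReader(fh))

print(loaded[0])
out_file.unlink()  # remove the example file again
```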
Notes and next steps
- If a cell raised an exception because a dependency is missing, install polars, numpy, and optionally pillow.
- To run longer scans, increase max_depth and threads in the probe() calls.
- Use probe_to_df(..., to_pandas=True) to get a pandas.DataFrame if you prefer pandas.