50 most recent check-ins
2025-09-01
| ||
23:11 | Refactor so that sampling of documents for retrieval is handled consistently Leaf check-in: 699394002f user: sam_hames tags: refactor-all | |
22:54 | Start rendering matching docs in the background as soon as possible check-in: 41fabe6da9 user: sam_hames tags: refactor-all | |
05:38 | Change the corpus interface so that the docs method is the entry point to iterators of docs, and the transformation functions only apply to a single doc at a time. check-in: efdcda7663 user: sam_hames tags: refactor-all | |
03:47 | Bugfix: make sure to check matching against all range features provided - only one needs to match to be True check-in: ea410b6f40 user: sam_hames tags: refactor-all | |
2025-08-28
| ||
05:50 | Add utility tools for rearranging the output of doc_features. This will support future interfaces for concordance and passage/snippet display. check-in: aa53088521 user: sam_hames tags: refactor-all | |
2025-08-26
| ||
01:02 | Display range features correctly in the web interface check-in: d6f13727c1 user: sam_hames tags: refactor-all | |
01:01 | Bugfix: range query on range encoded feature was not working with an exact lower bound check-in: 6c1a5a8896 user: sam_hames tags: refactor-all | |
2025-08-25
| ||
00:53 | Make sure IndexPlugin's post_index_rebuild method is actually called check-in: d916995653 user: sam_hames tags: refactor-all | |
2025-08-24
| ||
23:27 | Refactor so that random_state is a property of HyperrealIndex and can be used across all plugins check-in: 877c46c5f6 user: sam_hames tags: refactor-all | |
2025-08-15
| ||
02:48 | Make sure test data is actually checked in check-in: 414e7f837c user: sam_hames tags: refactor-all | |
2025-08-14
| ||
05:56 | Don't autoreload the webserver check-in: 4de037fe2a user: sam_hames tags: refactor-all | |
2025-08-13
| ||
22:55 | Bugfix: make sure the selected cluster is always in the list of clusters for feature similarity calculations check-in: 4c4dfcfcde user: sam_hames tags: refactor-all | |
22:30 | Web performance: by default only compute and show the similar features from the top_k_clusters check-in: 623be93808 user: sam_hames tags: refactor-all | |
21:42 | Add an option to cluster using passages rather than documents as the indicator check-in: 4d60fa1619 user: sam_hames tags: refactor-all | |
11:27 | Update newsgroups example to better handle repeated text at the paragraph level check-in: e981492155 user: sam_hames tags: refactor-all | |
11:27 | Insert/update features in clustering in sorted order for reproducibility (python sets are not guaranteed to preserve order) check-in: 1b743ae1ac user: sam_hames tags: refactor-all | |
03:47 | Make feature_clustering use a fixed random_state object that can be directly set by the user for repeatable randomness over a series of operations check-in: 48bac0160b user: sam_hames tags: refactor-all | |
01:42 | Tweak urls and highlight-features for the cluster drill-down view check-in: 46d668e44b user: sam_hames tags: refactor-all | |
01:31 | Make feature clusters editable through the web UI check-in: 37491ee032 user: sam_hames tags: refactor-all | |
2025-08-12
| ||
07:39 | Bugfix: not passing through max_workers correctly to index.rebuild check-in: e45dfbefec user: sam_hames tags: refactor-all | |
05:51 | Experimental interface to make the features used in a query available to a corpus renderer for highlighting of results - this is a WIP and will probably change check-in: a482c62dd5 user: sam_hames tags: refactor-all | |
2025-08-11
| ||
23:40 | Add footer links to drilldown into the selected cluster view check-in: a48bc2c3ca user: sam_hames tags: refactor-all | |
22:45 | Refine display and alignment of quantities with headers check-in: cfeceb92d6 user: sam_hames tags: refactor-all | |
03:28 | Refine layout and styling by enabling different sized columns, and showing both doc_counts and similarity marks aligned check-in: 2bc9befd80 user: sam_hames tags: refactor-all | |
2025-08-10
| ||
02:12 | Show number of sampled and matching documents in header for matches check-in: d2b879ef66 user: sam_hames tags: refactor-all | |
02:12 | Handle quoted lines consistently with new helper method check-in: 61b7ad181e user: sam_hames tags: refactor-all | |
01:27 | Try a different CSS styling for a more compact display that uses more screenspace when available check-in: 039f273d79 user: sam_hames tags: refactor-all | |
01:26 | Add navigation to next and previous clusters for the cluster drilldown view check-in: 28e3b54bde user: sam_hames tags: refactor-all | |
2025-08-08
| ||
07:37 | Add an additional endpoint to handle drilling down into the full detail of a cluster - this displays the features in the selected cluster next to all other clusters, sorted by similarity and enables more specific drilldown. check-in: 5f1115dc48 user: sam_hames tags: refactor-all | |
2025-08-07
| ||
06:39 | Calculate cluster facets in parallel using the background pool. Refactor how the db state is managed by ensuring it is always accessed through the HyperrealIndex object, rather than the previous convenience assignment - this assignment prevents a feature cluster plugin from being picklable/usable in a background pool. check-in: 0b16c7b9a0 user: sam_hames tags: refactor-all | |
05:49 | On second thoughts, don't change the objective, just leave it how it is and come back to it later in a more systematic way check-in: a9cc808915 user: sam_hames tags: refactor-all | |
01:52 | Display a sample of all documents on the cluster browser view with no selected feature/cluster check-in: 2241285f61 user: sam_hames tags: refactor-all | |
01:38 | Web related fixes: pass through extra_css from the corpus, list docs properly in the cluster browser, also enable selecting the port to serve on check-in: 3476bcde9c user: sam_hames tags: refactor-all | |
01:35 | Change algorithm to penalise according to the minimum of n_docs and hits - this will more heavily penalise clusters with a large number of long features. check-in: 30257ab088 user: sam_hames tags: refactor-all | |
2025-08-04
| ||
06:50 | Add split_cluster_into functionality on the feature clustering check-in: 60e63f3366 user: sam_hames tags: refactor-all | |
05:50 | Cleanup lint warnings for feature_cluster.py check-in: 64b8c80f3f user: sam_hames tags: refactor-all | |
05:33 | Apply/lint formatting for import order check-in: 8820fc6831 user: sam_hames tags: refactor-all | |
05:31 | Add new functionality for merging clusters, add test cases for cluster level operations, also fix a transaction management bug raised by doing this work check-in: 728c284789 user: sam_hames tags: refactor-all | |
02:35 | Tidy up the docstrings for the ValueHandler check-in: 2f244af160 user: sam_hames tags: refactor-all | |
2025-07-30
| ||
03:53 | Change handler interface so they have access to the corpus check-in: 1995e12ca2 user: sam_hames tags: refactor-all | |
2025-06-06
| ||
06:54 | Add stub configuration for generating documentation using sphinx. This enables all-in-one-page HTML generation, includes version info that will help align it with fossil based publishing and linking, and also the facilities for including selected API docs are what I want. check-in: 81cdfc1b93 user: sam_hames tags: refactor-all | |
2025-06-04
| ||
04:16 | Improve handling of quoted components in body of posts, handle jupyter notebook and script invocation of the server from the source notebook, CSS tweak, remove html_indexable_doc interface from the corpus check-in: 15d281e374 user: sam_hames tags: refactor-all | |
2025-04-25
| ||
05:19 | Allow choosing top_k features from field_features check-in: fb7dec78e0 user: sam_hames tags: refactor-all | |
05:17 | Allow pivoting by a range query in the web UI check-in: 85d57e5153 user: sam_hames tags: refactor-all | |
2025-04-18
| ||
12:14 | Bugfix: handling range features with no lower bound correctly check-in: 56866bf229 user: sam_hames tags: refactor-all | |
2025-04-08
| ||
04:26 | Flesh out the faceted visualisation further to display more than one thing, and consistently with the clustering display check-in: 90c0b23fcb user: sam_hames tags: refactor-all | |
04:25 | Bugfix: incorrectly handling the edge case for the edge of a range encoded literal value check-in: 240a0acd0e user: sam_hames tags: refactor-all | |
2025-04-07
| ||
10:51 | Make the facets linked for filtering and drilling down check-in: c20acfc49c user: sam_hames tags: refactor-all | |
05:50 | Randomly sample documents for display, refine filtering interface/helpers, and spike out a facet rendering view check-in: 47123c4ebb user: sam_hames tags: refactor-all | |
2025-04-06
| ||
22:55 | Randomly subsample documents for display check-in: 0651ec0816 user: sam_hames tags: refactor-all | |