ibeis.algo.hots.smk package¶

Submodules¶

ibeis.algo.hots.smk.smk_core module¶

smk core

ibeis.algo.hots.smk.smk_core.accumulate_scores(dscores_list, daids_list)[source]¶: helper to accumulate grouped scores for database annotations

ibeis.algo.hots.smk.smk_core.build_correspondences(sparse_list, qfxs_list, dfxs_list, daids_list)[source]¶: helper. These list comprehensions replace the previous for loop. They still need to be optimized a little (and made clearer), and can probably be unnested as well.

ibeis.algo.hots.smk.smk_core.build_daid2_chipmatch2(invindex, common_wxs, wx2_qaids, wx2_qfxs, scores_list, daids_list, query_sccw)[source]¶

Builds explicit chipmatches that the rest of the pipeline plays nice with

Notation:

An explicit cmtup_old is a tuple (fm, fs, fk) of feature_matches, feature_scores, and feature_ranks.

Let N be the number of matches

A feature match, fm{shape=(N, 2), dtype=int32}, is an array where the first column corresponds to query_feature_indexes (qfx) and the second column corresponds to database_feature_indexes (dfx).

A feature score, fs{shape=(N,), dtype=float64} is an array of scores

A feature rank, fk{shape=(N,), dtype=int16} is an array of ranks

Returns:

daid2_chipmatch – (daid2_fm, daid2_fs, daid2_fk) formatted as::

daid2_fm (dict): {daid: fm, ...}
daid2_fs (dict): {daid: fs, ...}
daid2_fk (dict): {daid: fk, ...}

Return type:

tuple of dict
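The notation above can be made concrete with a small sketch. The match values below are made up, and `make_cmtup` is a hypothetical helper that is not part of ibeis; only the shapes and dtypes follow the notation.

```python
import numpy as np

def make_cmtup(qfxs, dfxs, scores, ranks):
    """Pack N matches into an (fm, fs, fk) tuple (hypothetical helper)."""
    fm = np.stack([qfxs, dfxs], axis=1).astype(np.int32)  # shape (N, 2): qfx, dfx
    fs = np.asarray(scores, dtype=np.float64)              # shape (N,)
    fk = np.asarray(ranks, dtype=np.int16)                 # shape (N,)
    return fm, fs, fk

# Made-up matches for two database annotations (daids 7 and 9)
raw_matches = {
    7: ([0, 3, 5], [2, 2, 8], [0.9, 0.4, 0.1]),
    9: ([1], [4], [0.7]),
}
daid2_fm, daid2_fs, daid2_fk = {}, {}, {}
for daid, (qfxs, dfxs, scores) in raw_matches.items():
    ranks = np.zeros(len(qfxs))  # placeholder ranks
    fm, fs, fk = make_cmtup(qfxs, dfxs, scores, ranks)
    daid2_fm[daid], daid2_fs[daid], daid2_fk[daid] = fm, fs, fk

# Three parallel dictionaries keyed by daid
daid2_chipmatch = (daid2_fm, daid2_fs, daid2_fk)
```

Keeping three parallel dictionaries keyed by daid is what lets the examples compare the old and new builders one dictionary at a time.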

Example

>>> from ibeis.algo.hots.smk.smk_core import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, invindex, qindex, qparams = smk_debug.testdata_match_kernel_L2()
>>> wx2_qrvecs, wx2_qmaws, wx2_qaids, wx2_qfxs, query_sccw = qindex
>>> smk_alpha = ibs.cfg.query_cfg.smk_cfg.smk_alpha
>>> smk_thresh = ibs.cfg.query_cfg.smk_cfg.smk_thresh
>>> withinfo = True  # takes 11s vs 2s
>>> args = (wx2_qrvecs, wx2_qmaws, wx2_qaids, wx2_qfxs, query_sccw, invindex, withinfo, smk_alpha, smk_thresh)
>>> retL1 =  match_kernel_L1(*args)
>>> (daid2_totalscore, common_wxs, scores_list, daids_list, idf_list, daid_agg_keys,)  = retL1
>>> daid2_chipmatch_old = build_daid2_chipmatch2(invindex, common_wxs, wx2_qaids, wx2_qfxs, scores_list, daids_list, query_sccw)
>>> daid2_chipmatch_new = build_daid2_chipmatch3(invindex, common_wxs, wx2_qaids, wx2_qfxs, scores_list, daids_list, query_sccw)
>>> print(utool.is_dicteq(daid2_chipmatch_old[0], daid2_chipmatch_new[0]))
>>> print(utool.is_dicteq(daid2_chipmatch_old[2], daid2_chipmatch_new[2]))
>>> print(utool.is_dicteq(daid2_chipmatch_old[1],  daid2_chipmatch_new[1]))

%timeit build_daid2_chipmatch2(invindex, common_wxs, wx2_qaids, wx2_qfxs, scores_list, daids_list, query_sccw)
%timeit build_daid2_chipmatch3(invindex, common_wxs, wx2_qaids, wx2_qfxs, scores_list, daids_list, query_sccw)

ibeis.algo.hots.smk.smk_core.build_daid2_chipmatch3(qindex, invindex, common_wxs, scores_list, daids_list)[source]¶

Parameters:

invindex (InvertedIndex) – object for fast vocab lookup
common_wxs (list) – list of word intersections
wx2_qfxs (dict) –
scores_list (list) –
daids_list (list) –
query_sccw (float) – query self-consistency-criterion

Returns:

daid2_chipmatch

Example

>>> from ibeis.algo.hots.smk.smk_core import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, invindex, qindex, qparams = smk_debug.testdata_match_kernel_L2(aggregate=True)
>>> args = (qindex, invindex, qparams)
>>> retL1 =  match_kernel_L1(*args)
>>> (daid2_totalscore, common_wxs, scores_list, daids_list, idf_list, daid_agg_keys,)  = retL1
>>> daid2_chipmatch_new = build_daid2_chipmatch3(invindex, common_wxs, wx2_qfxs, scores_list, daids_list, query_sccw)
>>> daid2_chipmatch_old = build_daid2_chipmatch2(invindex, common_wxs, wx2_qfxs, scores_list, daids_list, query_sccw)
>>> print(utool.is_dicteq(daid2_chipmatch_old[0], daid2_chipmatch_new[0]))
>>> print(utool.is_dicteq(daid2_chipmatch_old[2], daid2_chipmatch_new[2]))
>>> print(utool.is_dicteq(daid2_chipmatch_old[1], daid2_chipmatch_new[1]))

Notation::

The format of feature index lists is:

fxs_list ~ [ ... list_per_word ... ]
list_per_word ~ [ ... list_per_rvec ... ]
list_per_rvec ~ [ features contributing to rvec (only one if agg=False) ]

ibeis.algo.hots.smk.smk_core.flatten_correspondences(fm_nestlist, fs_nestlist, daid_nestlist, query_sccw)[source]¶: helper

ibeis.algo.hots.smk.smk_core.group_correspondences(all_matches, all_scores, all_daids, daid2_sccw)[source]¶

ibeis.algo.hots.smk.smk_core.match_kernel_L0(qrvecs_list, drvecs_list, qflags_list, dflags_list, qmaws_list, dmaws_list, smk_alpha, smk_thresh, idf_list, daids_list, daid2_sccw, query_sccw)[source]¶

Computes smk kernels

Parameters:

qrvecs_list (list) –
drvecs_list (list) –
qflags_list (list) –
dflags_list (list) –
qmaws_list (list) –
dmaws_list (list) –
smk_alpha (float) – selectivity power
smk_thresh (float) – selectivity threshold
idf_list (list) –
daids_list (list) –
daid2_sccw (dict) –
query_sccw (float) – query self-consistency-criterion

Returns:

retL0 – (daid2_totalscore, scores_list, daid_agg_keys,)

Return type:

tuple
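The roles of smk_alpha (selectivity power) and smk_thresh (selectivity threshold) can be illustrated with a minimal sketch. It assumes the selectivity function from the SMK literature, sigma(u) = sign(u) * |u| ** alpha when u > thresh and 0 otherwise. The names `selectivity` and `word_score` are hypothetical and are not claimed to match the internals of match_kernel_L0.

```python
import numpy as np

def selectivity(u, smk_alpha, smk_thresh):
    """Assumed form: sign(u) * |u| ** alpha where u > thresh, else 0."""
    u = np.asarray(u, dtype=np.float64)
    return np.where(u > smk_thresh, np.sign(u) * np.abs(u) ** smk_alpha, 0.0)

def word_score(qrvecs, drvecs, idf, smk_alpha, smk_thresh):
    """Unnormalized score contributed by one visual word (hypothetical helper)."""
    # Cosine similarities between unit-length query and database residuals
    simmat = np.asarray(qrvecs).dot(np.asarray(drvecs).T)
    return idf * selectivity(simmat, smk_alpha, smk_thresh).sum()

# Made-up unit residuals for a single word
qrvecs = np.array([[1.0, 0.0]])
drvecs = np.array([[1.0, 0.0], [0.0, 1.0]])
score = word_score(qrvecs, drvecs, idf=2.0, smk_alpha=3, smk_thresh=0.0)
```

A larger smk_alpha suppresses weak similarities more strongly, while smk_thresh discards them outright.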

CommandLine:

python -m ibeis.algo.hots.smk.smk_core --test-match_kernel_L0

Example

>>> # DISABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_core import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> #smk_debug.rrr()
>>> core1, core2, extra = smk_debug.testdata_match_kernel_L0()
>>> smk_alpha, smk_thresh, query_sccw, daids_list, daid2_sccw = core1
>>> qrvecs_list, drvecs_list, qmaws_list, dmaws_list, idf_list = core2
>>> qaid2_sccw, qaids_list = extra
>>> retL0 = match_kernel_L0(qrvecs_list, drvecs_list, qmaws_list, dmaws_list, smk_alpha, smk_thresh, idf_list, daids_list, daid2_sccw, query_sccw)
>>> # Test Asymmetric Matching
>>> (daid2_totalscore, scores_list, daid_agg_keys,) = retL0
>>> print(daid2_totalscore[5])
0.336434201301
>>> # Test Self Consistency
>>> qret = match_kernel_L0(qrvecs_list, qrvecs_list, qmaws_list, qmaws_list, smk_alpha, smk_thresh, idf_list, qaids_list, qaid2_sccw, query_sccw)
>>> (qaid2_totalscore, qscores_list, qaid_agg_keys,) = qret
>>> print(qaid2_totalscore[42])
1.0000000000000007

ibeis.algo.hots.smk.smk_core.match_kernel_L1(qindex, invindex, qparams)[source]¶: Builds up information and handles verbose output before calling L0

ibeis.algo.hots.smk.smk_core.match_kernel_L2(qindex, invindex, qparams, withinfo=True)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_core import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, invindex, qindex, qparams = smk_debug.testdata_match_kernel_L2()
>>> withinfo = True  # takes 11s vs 2s
>>> smk_debug.rrr()
>>> smk_debug.invindex_dbgstr(invindex)
>>> daid2_totalscore, daid2_wx2_scoremat = match_kernel_L2(qindex, invindex, qparams, withinfo)

ibeis.algo.hots.smk.smk_core.mem_arange(num, cache={})[source]¶

ibeis.algo.hots.smk.smk_core.mem_meshgrid(wrange, hrange, cache={})[source]¶

ibeis.algo.hots.smk.smk_debug module¶

TODO:

Define easy-to-use classes and functions for the following concepts:

Vocabulary / Dictionary / Codebook: centroids that partition descriptor space. Many methods can be used to define a vocabulary. The simplest technique is K-Means clustering. Other learning algorithms can be implemented.
Vocabulary Quantizer: Quantizes / codes a raw descriptor by assigning it to one or more visual words with assignment weights. Can be implemented using simple approximate nearest neighbor to centroids, via tree partitioning, or some other method.
Inverted Index: Uses the vocabulary to index quantized descriptors. Needs add / remove methods that add and remove images as sets of descriptors. Needs to update the vocabulary and recompute any internal image representations. Needs to encode individual images or subimages.
Vocabulary Matching: Uses the inverted index to match individual or aggregated features between query and database images.
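As a rough illustration of how these concepts could fit together, here is a minimal sketch. `ToyInvertedIndex` and all of its methods are hypothetical names invented for this example. It uses brute-force nearest-centroid quantization with a single word per descriptor, and it does not update the vocabulary or store residuals.

```python
import numpy as np
from collections import defaultdict

class ToyInvertedIndex:
    """Hypothetical sketch: maps word index -> set of image ids."""

    def __init__(self, words):
        self.words = np.asarray(words, dtype=np.float64)  # (K, D) centroids
        self.wx2_imgids = defaultdict(set)
        self.imgid2_wxs = {}

    def quantize(self, vecs):
        """Assign each descriptor to its nearest centroid (brute force)."""
        vecs = np.asarray(vecs, dtype=np.float64)
        # Squared Euclidean distance between every descriptor and every word
        dists = ((vecs[:, None, :] - self.words[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    def add(self, imgid, vecs):
        """Index an image given as a set of descriptors."""
        wxs = self.quantize(vecs)
        self.imgid2_wxs[imgid] = wxs
        for wx in set(wxs.tolist()):
            self.wx2_imgids[wx].add(imgid)

    def remove(self, imgid):
        """Drop an image from every word it was indexed under."""
        for wx in set(self.imgid2_wxs.pop(imgid).tolist()):
            self.wx2_imgids[wx].discard(imgid)

    def candidates(self, vecs):
        """Image ids sharing at least one word with the query descriptors."""
        found = set()
        for wx in set(self.quantize(vecs).tolist()):
            found |= self.wx2_imgids.get(wx, set())
        return found
```

Keeping a forward map (`imgid2_wxs`) next to the inverted map makes removal cheap, because only the words an image actually touched need to be visited.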

ibeis.algo.hots.smk.smk_debug.assert_single_assigned_maws(maws_list)[source]¶

ibeis.algo.hots.smk.smk_debug.check_daid2_chipmatch(daid2_chipmatch, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.check_daid2_sccw(daid2_sccw, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.check_data_smksumm(aididf_list, aidrvecs_list)[source]¶

ibeis.algo.hots.smk.smk_debug.check_dtype(annots_df)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> import ibeis
>>> ibs = ibeis.opendb('PZ_MTEST')
>>> annots_df = make_annot_df(ibs)

ibeis.algo.hots.smk.smk_debug.check_invindex(invindex, verbose=True)[source]¶

Example

>>> from ibeis.algo.hots.smk import smk_index
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_repr
>>> ibs, annots_df, taids, daids, qaids, qreq_, nWords = smk_debug.testdata_dataframe()
>>> words = smk_index.learn_visual_words(annots_df, taids, nWords)
>>> qparams = qreq_.qparams
>>> invindex = smk_repr.index_data_annots(annots_df, daids, words, qparams)

ibeis.algo.hots.smk.smk_debug.check_invindex_wx2(invindex)[source]¶

ibeis.algo.hots.smk.smk_debug.check_qaid2_chipmatch(qaid2_chipmatch, qaids, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.check_rvecs_list_eq(rvecs_list, rvecs_list2)[source]¶

Example

>>> rvecs_list, flag_list = smk_residual.compute_nonagg_rvecs(*_args1)  # 125 ms
>>> rvecs_list2 = smk_speed.compute_nonagg_residuals_forloop(*_args1)

ibeis.algo.hots.smk.smk_debug.check_wx2(words=None, wx2_rvecs=None, wx2_aids=None, wx2_fxs=None)[source]¶: provides debug info for mappings from word indexes to values

ibeis.algo.hots.smk.smk_debug.check_wx2_idxs(wx2_idxs, nWords)[source]¶

ibeis.algo.hots.smk.smk_debug.check_wx2_rvecs(wx2_rvecs, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.check_wx2_rvecs2(invindex, wx2_rvecs=None, wx2_idxs=None, idx2_vec=None, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.convert_smkmatch_to_chipmatch(qaid2_chipmatch, qaid2_scores)[source]¶: converts old-style chipmatches into the new-style format accepted by the pipeline

ibeis.algo.hots.smk.smk_debug.dbstr_qindex(qindex_=None)[source]¶

ibeis.algo.hots.smk.smk_debug.dictinfo(dict_)[source]¶

ibeis.algo.hots.smk.smk_debug.display_info(ibs, invindex, annots_df)[source]¶

ibeis.algo.hots.smk.smk_debug.get_test_float_norm_rvecs(num=1000, dim=None, rng=numpy.random)[source]¶

ibeis.algo.hots.smk.smk_debug.get_test_maws(rvecs, rng=numpy.random)[source]¶

ibeis.algo.hots.smk.smk_debug.get_test_rvecs(num=1000, dim=None, nanrows=None, rng=numpy.random)[source]¶

ibeis.algo.hots.smk.smk_debug.invindex_dbgstr(invindex)[source]¶

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> ibs, annots_df, daids, qaids, invindex = testdata_raw_internals0()
>>> invindex_dbgstr(invindex)

ibeis.algo.hots.smk.smk_debug.main_smk_debug()[source]¶

CommandLine:

python -m ibeis.algo.hots.smk.smk_debug --test-main_smk_debug

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> main_smk_debug()

ibeis.algo.hots.smk.smk_debug.query_smk_test(annots_df, invindex, qreq_)[source]¶

ibeis interface

Example

>>> from ibeis.algo.hots.smk import smk_match
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = smk_debug.testdata_internals_full()
>>> qaid2_qres_ = smk_match.query_smk(annots_df, invindex, qreq_)

Dev::

qres = qaid2_qres_[qaids[0]]
fig = qres.show_top(ibs)

ibeis.algo.hots.smk.smk_debug.sift_stats()[source]¶

ibeis.algo.hots.smk.smk_debug.test_sccw_cache()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_apply_weights()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_compute_data_sccw(**kwargs)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA

ibeis.algo.hots.smk.smk_debug.testdata_dataframe(cfgdict=None, **kwargs)[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_ibeis(**kwargs)[source]¶

DEPRECATE Step 1

builds ibs for testing

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> kwargs = {}

ibeis.algo.hots.smk.smk_debug.testdata_ibeis2(cfgdict=None, **kwargs)[source]¶

Step 2

selects training and test set

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> kwargs = {}

ibeis.algo.hots.smk.smk_debug.testdata_internals_full(delete_rawvecs=True, **kwargs)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> kwargs = {}

ibeis.algo.hots.smk.smk_debug.testdata_match_kernel_L0()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_match_kernel_L2(**kwargs)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA

ibeis.algo.hots.smk.smk_debug.testdata_nonagg_rvec()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_query_repr(**kwargs)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA

ibeis.algo.hots.smk.smk_debug.testdata_raw_internals0(**kwargs)[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_raw_internals1(**kwargs)[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_raw_internals1_5(**kwargs)[source]¶

contains internal data up to idf weights

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA

ibeis.algo.hots.smk.smk_debug.testdata_sccw_sum(**kwargs)[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_selectivity_function()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_similarity_function()[source]¶

ibeis.algo.hots.smk.smk_debug.testdata_words(**kwargs)[source]¶

ibeis.algo.hots.smk.smk_debug.vector_normal_stats(vectors)[source]¶

ibeis.algo.hots.smk.smk_debug.vector_stats(vectors, name, verbose=True)[source]¶

ibeis.algo.hots.smk.smk_debug.wx_len_stats(wx2_xxx)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_debug import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_repr
>>> ibs, annots_df, taids, daids, qaids, qreq_, nWords = smk_debug.testdata_dataframe()
>>> qreq_ = query_request.new_ibeis_query_request(ibs, qaids, daids)
>>> qparams = qreq_.qparams
>>> invindex = smk_repr.index_data_annots(annots_df, daids, words)
>>> qaid = qaids[0]
>>> wx2_qrvecs, wx2_qaids, wx2_qfxs, query_sccw = smk_repr.new_qindex(annots_df, qaid, invindex, qparams)
>>> print(ut.dict_str(wx2_rvecs_stats(wx2_qrvecs)))

ibeis.algo.hots.smk.smk_hamming module¶

ibeis.algo.hots.smk.smk_hamming.get_vecs_hamming_encoding(vecs)[source]¶

Parameters:

vecs (ndarray) – descriptors assigned to a single word
P (ndarray) – random orthogonal projection matrix for that word

Example:

>>> np.random.seed(0)
>>> vecs = np.random.rand(10, 128)
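The parameter description suggests a Hamming-embedding style encoding. The sketch below is one way to do it, under stated assumptions: the projection matrix comes from a QR decomposition of a Gaussian matrix, and each bit is set by comparing a projected component against its per-word median. The helper names and the choice of 64 bits are hypothetical and are not taken from smk_hamming.

```python
import numpy as np

def make_orthogonal_projection(vec_dim, nbits, rng):
    """Random matrix with nbits orthonormal rows (hypothetical helper)."""
    gauss = rng.standard_normal((vec_dim, vec_dim))
    # The Q factor of a square Gaussian matrix is orthogonal
    Q, _ = np.linalg.qr(gauss)
    return Q[:nbits]

def hamming_encode(vecs, P):
    """Binarize projected descriptors against the per-bit median."""
    proj = vecs.dot(P.T)               # shape (N, nbits)
    thresh = np.median(proj, axis=0)   # one threshold per bit
    return proj > thresh

rng = np.random.RandomState(0)
vecs = rng.rand(10, 128)
P = make_orthogonal_projection(128, 64, rng)
codes = hamming_encode(vecs, P)
```

Thresholding at the median keeps each bit balanced across the descriptors of a word, which is why the threshold is computed per word rather than globally.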

ibeis.algo.hots.smk.smk_hamming.make_projetion_matrix(vec_dim)[source]¶

ibeis.algo.hots.smk.smk_index module¶

smk_index: This module contains functions for the Selective Match Kernel's inverted index.

TODO::

Test suite with 1000k images
Extend for SMK with labels
Test get numbers and refine
External keypoint specific weighting

ibeis.algo.hots.smk.smk_index.OLD_compute_data_sccw_(idx2_daid, wx2_drvecs, wx2_aids, wx2_idf, wx2_dmaws, smk_alpha, smk_thresh, verbose=False)[source]¶

ibeis.algo.hots.smk.smk_index.assign_to_words_(wordflann, words, idx2_vec, nAssign, massign_alpha, massign_sigma, massign_equal_weights)[source]¶

Assigns descriptor-vectors to nearest word.

Parameters:

wordflann (FLANN) – nearest neighbor index over words
words (ndarray) – vocabulary words
idx2_vec (ndarray) – descriptors to assign
nAssign (int) – number of words to assign each descriptor to
massign_alpha (float) – multiple-assignment ratio threshold
massign_sigma (float) – multiple-assignment gaussian variance
massign_equal_weights (bool) – assign equal weight to all multiassigned words

Returns:

inverted index, multi-assigned weights, and forward index formatted as:

* wx2_idxs - word index   -> vector indexes
* wx2_maws - word index   -> multi-assignment weights
* idx2_wxs - vector index -> assigned word indexes

Return type:

tuple

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = smk_debug.testdata_raw_internals0()
>>> words  = invindex.words
>>> wordflann = invindex.wordflann
>>> idx2_vec  = invindex.idx2_dvec
>>> nAssign = qreq_.qparams.nAssign
>>> massign_alpha = qreq_.qparams.massign_alpha
>>> massign_sigma = qreq_.qparams.massign_sigma
>>> massign_equal_weights = qreq_.qparams.massign_equal_weights
>>> _dbargs = (wordflann, words, idx2_vec, nAssign, massign_alpha, massign_sigma, massign_equal_weights)
>>> wx2_idxs, wx2_maws, idx2_wxs = assign_to_words_(*_dbargs)

ibeis.algo.hots.smk.smk_index.compute_data_sccw_(idx2_daid, wx2_drvecs, wx2_dflags, wx2_aids, wx2_idf, wx2_dmaws, smk_alpha, smk_thresh, verbose=False)[source]¶

Computes the sccw normalization scalar for the database annotations. This is gamma from the SMK paper. sccw is a self-consistency-criterion weight: a scalar which ensures that the score K(X, X) = 1.

Parameters:

idx2_daid –
wx2_drvecs –
wx2_dflags –
wx2_aids –
wx2_idf –
wx2_dmaws –
smk_alpha –
smk_thresh –

Returns:

daid2_sccw
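The self-consistency property can be checked numerically with a small sketch. It assumes the kernel is an idf-weighted sum over shared words of a selectivity function applied to residual similarities, so that sccw is the inverse square root of an annotation's unnormalized self-score. The function names are hypothetical, and flags and multi-assignment weights are left out for brevity.

```python
import numpy as np

def selectivity(u, alpha, thresh):
    """Assumed form: sign(u) * |u| ** alpha where u > thresh, else 0."""
    return np.where(u > thresh, np.sign(u) * np.abs(u) ** alpha, 0.0)

def unnormalized_kernel(wx2_xrvecs, wx2_yrvecs, wx2_idf, alpha, thresh):
    """Sum idf-weighted selectivity scores over the words both sides share."""
    total = 0.0
    for wx in set(wx2_xrvecs) & set(wx2_yrvecs):
        simmat = wx2_xrvecs[wx].dot(wx2_yrvecs[wx].T)
        total += wx2_idf[wx] * selectivity(simmat, alpha, thresh).sum()
    return total

def compute_sccw(wx2_rvecs, wx2_idf, alpha, thresh):
    """Scalar gamma such that gamma * gamma * K_unnormalized(X, X) == 1."""
    self_score = unnormalized_kernel(wx2_rvecs, wx2_rvecs, wx2_idf, alpha, thresh)
    return 1.0 / np.sqrt(self_score)

# Made-up unit residuals for one annotation, grouped by word index
wx2_rvecs = {0: np.array([[1.0, 0.0], [0.0, 1.0]]), 3: np.array([[0.6, 0.8]])}
wx2_idf = {0: 2.0, 3: 0.5}
sccw = compute_sccw(wx2_rvecs, wx2_idf, alpha=3, thresh=0.0)
self_kernel = sccw * sccw * unnormalized_kernel(wx2_rvecs, wx2_rvecs, wx2_idf, 3, 0.0)
```

Because the same scalar multiplies both the query and the database side, scores stay comparable across annotations with different numbers of features.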

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_index
>>> from ibeis.algo.hots.smk import smk_debug
>>> #tup = smk_debug.testdata_compute_data_sccw(db='testdb1')
>>> tup = smk_debug.testdata_compute_data_sccw(db='PZ_MTEST')
>>> ibs, annots_df, invindex, wx2_idxs, wx2_idf, wx2_drvecs, wx2_aids, qparams = tup
>>> wx2_dflags = invindex.wx2_dflags
>>> wx2_idxs = invindex.wx2_idxs
>>> wx2_dmaws  = invindex.wx2_dmaws
>>> idx2_daid  = invindex.idx2_daid
>>> daids      = invindex.daids
>>> smk_alpha  = qparams.smk_alpha
>>> smk_thresh = qparams.smk_thresh
>>> wx2_idf    = wx2_idf
>>> verbose = True
>>> invindex.invindex_dbgstr()
>>> invindex.report_memory()
>>> invindex.report_memsize()
>>> daid2_sccw = smk_index.compute_data_sccw_(idx2_daid, wx2_drvecs, wx2_dflags, wx2_aids, wx2_idf, wx2_dmaws, smk_alpha, smk_thresh, verbose)

ibeis.algo.hots.smk.smk_index.compute_idf_label1(aids_list, daid2_label)[source]¶

One of our idf extensions

Example

>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, wx2_idxs, qparams = smk_debug.testdata_raw_internals1()
>>> wx_series = np.arange(len(invindex.words))
>>> idx2_aid = invindex.idx2_daid
>>> daid2_label = invindex.daid2_label
>>> _ = helper_idf_wordgroup(wx2_idxs, idx2_aid, wx_series)
>>> idxs_list, aids_list = _
>>> wx2_idf = compute_idf_label1(aids_list, daid2_label)

ibeis.algo.hots.smk.smk_index.compute_idf_orig(aids_list, daids)[source]¶: The standard tried and true idf measure
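For reference, the textbook idf measure is log(N / n_w), where N is the number of database annotations and n_w is the number of annotations containing word w. The sketch below assumes that form; `idf_orig_sketch` is a hypothetical name and may differ from compute_idf_orig in details such as dtype or the handling of empty words.

```python
import numpy as np

def idf_orig_sketch(aids_list, daids):
    """Textbook idf: log(num annotations / num annotations containing the word)."""
    ndocs_total = len(daids)
    # Count each annotation once per word, even if it has several features there
    ndocs_per_word = np.array([len(set(aids)) for aids in aids_list],
                              dtype=np.float64)
    return np.log(ndocs_total / ndocs_per_word)

# Made-up data: one list of annotation ids per word, four database annotations
aids_list = [[1, 1, 2], [1, 2, 3, 4], [3]]
daids = [1, 2, 3, 4]
idf_list = idf_orig_sketch(aids_list, daids)
```

A word seen in every annotation gets weight 0, and rarer words get larger weights. Words with no members would need a guard against division by zero.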

ibeis.algo.hots.smk.smk_index.compute_multiassign_weights_(_idx2_wx, _idx2_wdist, massign_alpha, massign_sigma, massign_equal_weights)[source]¶

Multi Assignment Filtering from Improving Bag of Features

Parameters:

_idx2_wx –
_idx2_wdist –
massign_alpha –
massign_sigma –
massign_equal_weights – Turns off soft weighting. Gives all assigned vectors weight 1

Returns:

(idx2_wxs, idx2_maws)

Return type:

tuple
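The filtering and weighting can be sketched as follows. The assumptions are: each row of the distance array is sorted ascending and holds squared distances to the candidate words, a candidate is kept when its distance is within massign_alpha times the nearest distance, and the kept candidates are weighted by a Gaussian with variance massign_sigma ** 2 and normalized per descriptor. `multiassign_weights` is a hypothetical name, and edge cases such as exact zero distances are ignored.

```python
import numpy as np

def multiassign_weights(idx2_wxs, idx2_wdist, massign_alpha, massign_sigma,
                        massign_equal_weights=False):
    """Filter candidate words per descriptor and weight the survivors."""
    idx2_wxs = np.asarray(idx2_wxs)
    idx2_wdist = np.asarray(idx2_wdist, dtype=np.float64)
    # Keep candidates within a ratio alpha of the nearest word's distance
    nearest = idx2_wdist[:, 0:1]
    valid = idx2_wdist <= massign_alpha * nearest
    if massign_equal_weights:
        # Soft weighting turned off: every kept assignment gets weight 1
        weights = valid.astype(np.float64)
    else:
        unnorm = np.exp(-idx2_wdist / (2.0 * massign_sigma ** 2))
        unnorm = np.where(valid, unnorm, 0.0)
        weights = unnorm / unnorm.sum(axis=1, keepdims=True)
    wxs_list = [row[flags].tolist() for row, flags in zip(idx2_wxs, valid)]
    maws_list = [row[flags].tolist() for row, flags in zip(weights, valid)]
    return wxs_list, maws_list

# One descriptor with three candidate words (made-up distances)
wxs_list, maws_list = multiassign_weights(
    [[4, 9, 2]], [[100.0, 110.0, 400.0]], massign_alpha=1.2, massign_sigma=80.0)
```

Here the third candidate is dropped because 400 exceeds 1.2 * 100, and the two survivors share a total weight of 1 with the nearer word weighted slightly higher.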

References

(Improving Bag of Features) http://lear.inrialpes.fr/pubs/2010/JDS10a/jegou_improvingbof_preprint.pdf

(Lost in Quantization) http://www.robots.ox.ac.uk/~vgg/publications/papers/philbin08.ps.gz

(A Context Dissimilarity Measure for Accurate and Efficient Image Search) https://lear.inrialpes.fr/pubs/2007/JHS07/jegou_cdm.pdf

Notes

sigma values from Lost in Quantization (philbin_lost08): (70 ** 2) ~= 5000, (80 ** 2) ~= 6250, (86 ** 2) ~= 7500

Auto::

from ibeis.algo.hots.smk import smk_index
import utool as ut; print(ut.make_default_docstr(smk_index.compute_multiassign_weights_))

ibeis.algo.hots.smk.smk_index.compute_negentropy_names(aids_list, daid2_label)[source]¶

One of our idf extensions. Word weighting based on the negative entropy over all names of p(n_i | word).

Parameters:

aids_list (list of aids) –
daid2_label (dict from daid to label) –

Returns:

negentropy_list – idf-like weighting for each word based on the negative entropy

Return type:

ndarray[float32]

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, wx2_idxs, qparams = smk_debug.testdata_raw_internals1()
>>> wx_series = np.arange(len(invindex.words))
>>> idx2_aid = invindex.idx2_daid
>>> daid2_label = invindex.daid2_label
>>> _ = helper_idf_wordgroup(wx2_idxs, idx2_aid, wx_series)
>>> idxs_list, aids_list = _

Math::

p(n_i | word) = \sum_{lbl \in L_i} p(lbl | word)

p(lbl | word) = \frac{p(word | lbl) p(lbl)}{p(word)}

p(word) = \sum_{lbl' \in L} p(word | lbl') p(lbl')

p(word | lbl) = NumAnnotOfLabelWithWord / NumAnnotWithLabel = \frac{\sum_{X \in DB_{lbl}} b(word, X)}{\mathrm{card}(DB_{lbl})}

h(n | word) = -\sum_{i=1}^{N} p(n_i | word) \log p(n_i | word)

word\_weight = \log(N) - h(n | word)
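The formulas above can be traced for a single word with a short sketch. It makes two simplifying assumptions that the formulas do not state: each name corresponds to exactly one label (so L_i has a single element), and the prior p(lbl) is uniform. `negentropy_word_weight` is a hypothetical name, not the library function.

```python
import numpy as np

def negentropy_word_weight(word_aids, daid2_label):
    """log(N) - h(n | word) for one word, with one label per name."""
    labels = sorted(set(daid2_label.values()))
    nlabels = len(labels)
    # card(DB_lbl): number of annotations carrying each label
    lbl2_total = {lbl: 0 for lbl in labels}
    for lbl in daid2_label.values():
        lbl2_total[lbl] += 1
    # Number of annotations of each label that contain the word
    lbl2_hits = {lbl: 0 for lbl in labels}
    for aid in set(word_aids):
        lbl2_hits[daid2_label[aid]] += 1
    p_word_given_lbl = np.array(
        [lbl2_hits[lbl] / lbl2_total[lbl] for lbl in labels], dtype=np.float64)
    p_lbl = np.full(nlabels, 1.0 / nlabels)      # assumed uniform prior
    joint = p_word_given_lbl * p_lbl
    p_word = joint.sum()
    if p_word == 0:
        return 0.0                               # word never observed
    p_lbl_given_word = joint / p_word            # Bayes rule
    nonzero = p_lbl_given_word > 0               # treat 0 * log(0) as 0
    entropy = -np.sum(p_lbl_given_word[nonzero] * np.log(p_lbl_given_word[nonzero]))
    return np.log(nlabels) - entropy

daid2_label = {1: 'a', 2: 'a', 3: 'b', 4: 'b'}
weight_specific = negentropy_word_weight([1, 2], daid2_label)  # seen only in 'a'
weight_generic = negentropy_word_weight([1, 3], daid2_label)   # seen in both
```

A word that occurs in only one name gets the maximum weight log(N), while a word spread evenly over all names gets weight 0.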

CommandLine:

python dev.py -t smk2 --allgt --db GZ_ALL
python dev.py -t smk5 --allgt --db GZ_ALL

Auto:: python -c "import utool as ut; ut.print_auto_docstr('ibeis.algo.hots.smk.smk_index', 'compute_negentropy_names')"

ibeis.algo.hots.smk.smk_index.compute_residuals_(words, wx2_idxs, wx2_maws, idx2_vec, idx2_aid, idx2_fx, aggregate, verbose=False)[source]¶

Computes residual vectors based on word assignments. Returns a mapping from word index to a set of residual vectors.

Parameters:

words (ndarray) –
wx2_idxs (dict) –
wx2_maws (dict) –
idx2_vec (dict) –
idx2_aid (dict) –
idx2_fx (dict) –
aggregate (bool) –
verbose (bool) –

Returns:

(wx2_rvecs, wx2_aids, wx2_fxs, wx2_maws) formatted as::

wx2_rvecs - [ ... [ rvec_i1, ..., rvec_Mi ]_i ... ]
wx2_aids - [ ... [ aid_i1, ..., aid_Mi ]_i ... ]
wx2_fxs - [ ... [[fxs]_i1, ..., [fxs]_Mi ]_i ... ]

For every word:

* list of aggvecs
* For every aggvec:
    * one parent aid, if aggregate is True: assert isunique(aids)
    * list of parent fxs, if aggregate is False: assert len(fxs) == 1

Return type:

tuple
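For a single word, the residual computation can be sketched as below. Two things are assumed rather than taken from this page: residuals are L2-normalized, and aggregation sums the residuals of one annotation before renormalizing. `word_residuals` and `normalize_rows` are hypothetical names, and multi-assignment weights and flags are omitted.

```python
import numpy as np

def normalize_rows(arr):
    """L2-normalize each row, leaving all-zero rows untouched."""
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return arr / norms

def word_residuals(word, vecs, aids, aggregate):
    """Residuals of the descriptors assigned to one word (hypothetical helper)."""
    rvecs = normalize_rows(np.asarray(vecs, dtype=np.float64) - word)
    aids = np.asarray(aids)
    if not aggregate:
        # One residual per feature; an aid may repeat
        return rvecs, aids
    # One aggregated residual per annotation; aids become unique
    uniq_aids = np.unique(aids)
    agg = np.array([rvecs[aids == aid].sum(axis=0) for aid in uniq_aids])
    return normalize_rows(agg), uniq_aids

word = np.array([0.0, 0.0])
vecs = [[2.0, 0.0], [0.0, 3.0], [0.0, -5.0]]
aids = [1, 1, 2]
rvecs, rvec_aids = word_residuals(word, vecs, aids, aggregate=False)
agg_rvecs, agg_aids = word_residuals(word, vecs, aids, aggregate=True)
```

In the aggregated case the two features of annotation 1 collapse into one vector, which is why each aggregated vector has a single parent aid but may have several parent feature indexes.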

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, wx2_idxs, qparams = smk_debug.testdata_raw_internals1()
>>> words     = invindex.words
>>> idx2_aid  = invindex.idx2_daid
>>> idx2_fx   = invindex.idx2_dfx
>>> idx2_vec  = invindex.idx2_dvec
>>> aggregate = ibs.cfg.query_cfg.smk_cfg.aggregate
>>> wx2_rvecs, wx2_aids, wx2_fxs, wx2_maws, wx2_flags = compute_residuals_(words, wx2_idxs, wx2_maws, idx2_vec, idx2_aid, idx2_fx, aggregate)

ibeis.algo.hots.smk.smk_index.compute_word_idf_(wx_series, wx2_idxs, idx2_aid, daids, daid2_label=None, vocab_weighting='idf', verbose=False)[source]¶

Computes the inverse-document-frequency weighting for each word

Parameters:

wx_series –
wx2_idxs –
idx2_aid –
daids –
daid2_label –
vocab_weighting –

Returns:

wx2_idf

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, wx2_idxs, qparams = smk_debug.testdata_raw_internals1()
>>> wx_series = np.arange(len(invindex.words))
>>> idx2_aid = invindex.idx2_daid
>>> daid2_label = invindex.daid2_label
>>> wx2_idf = compute_word_idf_(wx_series, wx2_idxs, idx2_aid, daids)
>>> result = str(len(wx2_idf))
>>> print(result)
8000

Auto::

from ibeis.algo.hots.smk import smk_index
import utool as ut; print(ut.make_default_docstr(smk_index.compute_word_idf_))

ibeis.algo.hots.smk.smk_index.helper_idf_wordgroup(wx2_idxs, idx2_aid, wx_series)[source]¶: helper function

ibeis.algo.hots.smk.smk_index.learn_visual_words(ibs, config2_=None, use_cache=True, memtrack=None)[source]¶

Computes and caches visual words

Parameters:

ibs –
qreq (QueryRequest) – query request object with hyper-parameters
use_cache (bool) – turns on disk-based caching (default = True)
memtrack (None) – (default = None)

Returns:

words - aggregate descriptor cluster centers

Return type:

ndarray[uint8_t, ndim=2]
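Since the returned words are descriptor cluster centers, a plain k-means (Lloyd's algorithm) sketch shows the idea. This is a stand-in for illustration only: the clustering routine, initialization, caching, and uint8 conversion that learn_visual_words actually uses are not shown on this page, and `learn_words_kmeans` is a hypothetical name.

```python
import numpy as np

def learn_words_kmeans(vecs, nwords, niters=10, seed=0):
    """Plain Lloyd's k-means returning (nwords, D) cluster centers."""
    rng = np.random.RandomState(seed)
    vecs = np.asarray(vecs, dtype=np.float64)
    # Initialize centers from distinct randomly chosen descriptors
    words = vecs[rng.choice(len(vecs), nwords, replace=False)].copy()
    for _ in range(niters):
        # Assign every descriptor to its nearest center
        dists = ((vecs[:, None, :] - words[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        # Move each center to the mean of its members
        for wx in range(nwords):
            members = vecs[assign == wx]
            if len(members) > 0:
                words[wx] = members.mean(axis=0)
    return words

# Two well separated made-up clusters
vecs = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
words = learn_words_kmeans(vecs, nwords=2)
```

The brute-force distance matrix is fine for a toy example but would not scale to the vocabulary sizes used in the examples, where an approximate nearest neighbor search would be needed.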

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, taids, daids, qaids, qreq_, nWords = smk_debug.testdata_dataframe()
>>> use_cache = True
>>> words = learn_visual_words(ibs, qreq_)
>>> print(words.shape)
(8000, 128)

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_index import *  # NOQA
>>> import ibeis
>>> ibs = ibeis.opendb('PZ_Master1')
>>> config2_ = ibs.new_query_params(cfgdict=dict(nWords=128000))
>>> use_cache = True
>>> words = learn_visual_words(ibs, config2_)
>>> print(words.shape)
(8000, 128)

Auto::

from ibeis.algo.hots.smk import smk_index
import utool as ut
argdoc = ut.make_default_docstr(smk_index.learn_visual_words)
print(argdoc)

ibeis.algo.hots.smk.smk_match module¶

ibeis.algo.hots.smk.smk_match.execute_smk_L3(annots_df, qaid, invindex, qparams, withinfo=True)[source]¶

Executes a single smk query

Example

>>> from ibeis.algo.hots.smk.smk_match import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = smk_debug.testdata_internals_full()
>>> qaid = qaids[0]
>>> qparams = qreq_.qparams
>>> withinfo = True
>>> daid2_totalscore, daid2_chipmatch = execute_smk_L3(annots_df, qaid, invindex, qparams, withinfo)

ibeis.algo.hots.smk.smk_match.execute_smk_L4(annots_df, qaids, invindex, qparams, withinfo)[source]¶

Loop over execute_smk_L3

CommandLine:

python dev.py -t smk --allgt --db PZ_Mothers --index 1:3 --noqcache --va --vf

ibeis.algo.hots.smk.smk_match.execute_smk_L5(qreq_)[source]¶

ibeis query interface

Example

>>> from ibeis.algo.hots.smk.smk_match import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_match
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = smk_debug.testdata_internals_full()
>>> qaid2_scores, qaid2_chipmatch = smk_match.execute_smk_L5(qreq_)

Dev::

from ibeis.algo.hots import pipeline
filt2_meta = {}
# Get both spatial verified and not
qaid2_chipmatch_FILT_ = qaid2_chipmatch
qaid2_chipmatch_SVER_ = pipeline.spatial_verification(qaid2_chipmatch_FILT_, qreq_)
qaid2_qres_FILT_ = pipeline.chipmatch_to_resdict(qaid2_chipmatch_FILT_, filt2_meta, qreq_)
qaid2_qres_SVER_ = pipeline.chipmatch_to_resdict(qaid2_chipmatch_SVER_, filt2_meta, qreq_)
qres_FILT = qaid2_qres_FILT_[qaids[0]]
qres_SVER = qaid2_qres_SVER_[qaids[0]]
fig1 = qres_FILT.show_top(ibs, fnum=1, figtitle='filt')
fig2 = qres_SVER.show_top(ibs, fnum=2, figtitle='sver')
fig1.show()
fig2.show()

CommandLine::

python -m memory_profiler dev.py --db PZ_Mothers -t smk2 --allgt --index 0
python dev.py -t smk2 --allgt --db GZ_ALL
python dev.py -t smk2 --allgt --db GZ_ALL --index 2:10 --vf --va
python dev.py -t smk2 --allgt --db GZ_ALL --index 2:10 --vf --va --print-cfgstr
python dev.py -t smk2 --allgt --db GZ_ALL --index 2:20 --vf --va
python dev.py -t smk2 --allgt --db GZ_ALL --noqcache --index 2:20 --va --vf
python dev.py -t smk2 --allgt --db PZ_Master0 && python dev.py -t smk3 --allgt --db PZ_Master0
python dev.py -t smk2 --allgt --db PZ_Master0 --index 2:10 --va
python dev.py -t smk2 --allgt --db PZ_Mothers --index 20:30
python dev.py -t smk2 --allgt --db PZ_Mothers --noqcache --index 18:20 --super-strict --va
python dev.py -t smk2 --db PZ_Master0 --qaid 7199 --va --quality --vf --noqcache
python dev.py -t smk3 --allgt --db GZ_ALL --index 2:10 --vf --va
python dev.py -t smk5 --allgt --db PZ_Master0 --noqcache ; python dev.py -t smk5 --allgt --db GZ_ALL --noqcache
python dev.py -t smkd --allgt --db PZ_Mothers --index 1:3 --va --quality --vf --noqcache

python dev.py -t smk_8k --allgt --db PZ_Mothers --index 20:30 --va --vf
python dev.py -t smk_8k --allgt --db PZ_Mothers --index 20:30 --echo-hardcase
python dev.py -t smk_8k --allgt --db PZ_Mothers --index 20:30 --vh
python dev.py -t smk_8k_compare --allgt --db PZ_Mothers --index 20:30 --view-hard

ibeis.algo.hots.smk.smk_match.prepare_qreq(qreq_, annots_df, memtrack)[source]¶: Called if pipeline did not setup qreq correctly

ibeis.algo.hots.smk.smk_plots module¶

Algorithm::

Feature Weighting
Viewpoints
Labels
Choose Exemplars based on Scores
Normalizing Scores Per Name
Incremental Version

class ibeis.algo.hots.smk.smk_plots.Metrics(wx2_nMembers, wx2_pdist_stats, wx2_wdist_stats)¶

Bases: tuple

__getnewargs__()¶: Return self as a plain tuple. Used by copy and pickle.

__getstate__()¶: Exclude the OrderedDict from pickling

__repr__()¶: Return a nicely formatted representation string

wx2_nMembers¶: Alias for field number 0

wx2_pdist_stats¶: Alias for field number 1

wx2_wdist_stats¶: Alias for field number 2

ibeis.algo.hots.smk.smk_plots.compute_word_metrics(invindex)[source]¶

ibeis.algo.hots.smk.smk_plots.draw_scatterplot(figdir, ibs, datax, datay, xlabel, ylabel, color, fnum=None)[source]¶

ibeis.algo.hots.smk.smk_plots.dump_word_patches(ibs, vocabdir, invindex, wx_sample, metrics)[source]¶: Dumps word member patches to disk

ibeis.algo.hots.smk.smk_plots.get_cached_vocabs()[source]¶

ibeis.algo.hots.smk.smk_plots.get_metric(metrics, tupkey, statkey=None)[source]¶

ibeis.algo.hots.smk.smk_plots.get_qres_and_closest_valid_k(ibs, aid, K=4)[source]¶

Example

>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> import numpy as np
>>> from ibeis.algo.hots import query_request
>>> import ibeis
>>> ibs = ibeis.opendb('testdb1')
>>> aid = 2

ibeis.algo.hots.smk.smk_plots.get_word_dname(wx, metrics)[source]¶

ibeis.algo.hots.smk.smk_plots.get_word_dpaths(vocabdir, wx_sample, metrics)[source]¶: Gets word folder names and ensure they exist

ibeis.algo.hots.smk.smk_plots.main_options()[source]¶

ibeis.algo.hots.smk.smk_plots.make_scatterplots(ibs, figdir, invindex, metrics)[source]¶

ibeis.algo.hots.smk.smk_plots.make_wordfigures(ibs, metrics, invindex, figdir, wx_sample, wx2_dpath)[source]¶: Builds mosaics of patches assigned to words in the sample and outputs them to disk

ibeis.algo.hots.smk.smk_plots.metric_clamped_stat(metrics, wx_list, key)[source]¶: if key is a tuple, it specifies a statdict and a chosen stat; otherwise it is just a key

ibeis.algo.hots.smk.smk_plots.plot_chip_metric(ibs, aid, metric=None, fnum=1, lbl='', figtitle='', colortype='score', darken=0.5, cmap_='hot', reverse_cmap=False, **kwargs)[source]¶

Plots one annotation with one metric.

The word "metric" is used liberally.

Example

>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_plots
>>> from ibeis.algo.hots.smk import smk_repr
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=64000)
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=8000)
>>> #tup = smk_debug.testdata_raw_internals0(db='PZ_Master0', nWords=64000)
>>> tup = smk_debug.testdata_raw_internals0(db='PZ_Mothers', nWords=8000)
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = tup
>>> smk_repr.compute_data_internals_(invindex, qreq_.qparams, delete_rawvecs=False)
>>> invindex.idx2_wxs = np.array(invindex.idx2_wxs)
>>> metric = None
>>> aid = 1
>>> fnum = 0
>>> lbl='test'
>>> colortype='score'
>>> kwargs = {'annote': False}
>>> #df2.rrr()
>>> smk_plots.plot_chip_metric(ibs, aid, metric, fnum, lbl, colortype=colortype, **kwargs)
>>> df2.present()

ibeis.algo.hots.smk.smk_plots.present()[source]¶

ibeis.algo.hots.smk.smk_plots.select_by_metric(wx2_metric, per_quantile=20)[source]¶

ibeis.algo.hots.smk.smk_plots.smk_plots_main()[source]¶

smk python smk_plots.py --db PZ_MTEST --notoolbar

CommandLine:

python -m ibeis.algo.hots.smk.smk_plots --test-smk_plots_main
python -m ibeis.algo.hots.smk.smk_plots --test-smk_plots_main --db PZ_MTEST --notoolbar

Example

>>> # DISABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> smk_plots_main()

ibeis.algo.hots.smk.smk_plots.view_vocabs()[source]¶

looks in vocab cachedir and prints info / visualizes the vocabs using PCA

CommandLine:

python -m ibeis.algo.hots.smk.smk_plots --test-view_vocabs --show

Example

>>> # DISABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> # build test data
>>> # execute function
>>> view_vocabs()
>>> ut.quit_if_noshow()
>>> ut.show_if_requested()

ibeis.algo.hots.smk.smk_plots.viz_annot_with_metrics(ibs, invindex, aid, metrics, metric_keys=['wx2_nMembers', ('wx2_pdist_stats', 'mean'), ('wx2_wdist_stats', 'mean')], show_orig=True, show_idf=True, show_words=False, show_analysis=True, show_aveprecision=True, qfx2_closest_k_list=None, show_word_correct_assignments=False, qres_list=None)[source]¶

Parameters:	ibs (IBEISController) –
invindex (InvertedIndex) – object for fast vocab lookup
aid (int) –
metrics (namedtuple) –

Example

>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_repr
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=64000)
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=8000)
>>> #tup = smk_debug.testdata_raw_internals0(db='PZ_Master0', nWords=64000)
>>> tup = smk_debug.testdata_raw_internals0(db='PZ_Mothers', nWords=8000)
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = tup
>>> smk_repr.compute_data_internals_(invindex, qreq_.qparams, delete_rawvecs=False)
>>> invindex.idx2_wxs = np.array(invindex.idx2_wxs)
>>> metric_keys=['wx2_nMembers', ('wx2_pdist_stats', 'mean'), ('wx2_wdist_stats', 'mean')]
>>> metrics = compute_word_metrics(invindex)
>>> aid = 1

ibeis.algo.hots.smk.smk_plots.vizualize_vocabulary(ibs, invindex)[source]¶

cleaned up version of dump_word_patches. Makes idf scatter plots and dumps the patches that contributed to each word.

CommandLine:

python -m ibeis.algo.hots.smk.smk_plots --test-vizualize_vocabulary
python -m ibeis.algo.hots.smk.smk_plots --test-vizualize_vocabulary --vf

Example

>>> from ibeis.algo.hots.smk.smk_plots import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_repr
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=64000)
>>> #tup = smk_debug.testdata_raw_internals0(db='GZ_ALL', nWords=8000)
>>> tup = smk_debug.testdata_raw_internals0(db='PZ_Master0', nWords=64000)
>>> #tup = smk_debug.testdata_raw_internals0(db='PZ_Mothers', nWords=8000)
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = tup
>>> smk_repr.compute_data_internals_(invindex, qreq_.qparams, delete_rawvecs=False)
>>> vizualize_vocabulary(ibs, invindex)

ibeis.algo.hots.smk.smk_repr module¶

class ibeis.algo.hots.smk.smk_repr.DataFrameProxy(annots_df, ibs)[source]¶

Bases: object

DEPRECATE

pandas is actually really slow. This class emulates it so I don’t have to change my function calls, but without all the slowness.

class ibeis.algo.hots.smk.smk_repr.InvertedIndex(invindex, words, wordflann, idx2_vec, idx2_aid, idx2_fx, daids, daid2_label)[source]¶

Bases: object

Stores inverted index state information (mapping from words to database aids and fxs_list)

idx2_dvec¶: ndarray[S x DIM] – stacked index -> descriptor vector (currently sift)

idx2_daid¶: ndarray[S x 1] – stacked index -> annot id

idx2_dfx¶: ndarray[S x 1] – stacked index -> feature index (wrt daid)

idx2_fweight¶: ndarray[S x 1] – stacked index -> feature weight

idx2_wxs¶: list – stacked index -> word indexes (jagged)

words¶: ndarray[C x DIM] – visual word centroids

wordflann¶: FLANN – FLANN search structure

wx2_idxs¶: dict of lists of ndarrays – word index -> stacked indexes

wx2_fxs¶: dict of lists of ndarrays – word index -> aggregate feature indexes

wx2_aids¶: dict of ndarrays[N_c x 1] – word index -> aggregate aids

wx2_drvecs¶: dict of ndarrays[N_c x DIM] – word index -> residual vectors

wx2_dflags¶: dict of ndarrays[N_c x 1] – word index -> residual flags

wx2_idf¶: dict of ndarrays[N_c x 1] – word index -> idf (wx normalizer)

wx2_maws¶: dict of ndarrays[N_c x 1] – word index -> multi-assign weights

daids¶: ndarray – indexed annotation ids

daid2_sccw¶: dict of floats – daid -> sccw (daid self-consistency weight)

daid2_label¶: dict of tuples – daid -> label (name, view)

rrr(verbose=True)¶: special class reloading function
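
The relationship between the stacked-index arrays (idx2_*) and the word-keyed dictionaries (wx2_*) can be sketched as follows. This is a minimal illustration that assumes single assignment (one word per descriptor) and no aggregation; the helper name `invert_assignments` and the toy data are hypothetical and are not part of ibeis.

```python
import numpy as np

def invert_assignments(idx2_wx):
    """Group stacked indexes by word index (hypothetical helper)."""
    idx2_wx = np.asarray(idx2_wx)
    wx2_idxs = {}
    for wx in np.unique(idx2_wx):
        # All stacked indexes whose descriptor was assigned to word wx
        wx2_idxs[int(wx)] = np.flatnonzero(idx2_wx == wx)
    return wx2_idxs

# Toy stacked arrays: 5 descriptors coming from two annotations
idx2_daid = np.array([1, 1, 1, 2, 2])  # stacked index -> annot id
idx2_dfx = np.array([0, 1, 2, 0, 1])   # stacked index -> feature index
idx2_wx = np.array([7, 3, 7, 3, 9])    # stacked index -> word index

wx2_idxs = invert_assignments(idx2_wx)
# word index -> annot ids / feature indexes of the word's members
wx2_aids = {wx: idx2_daid[idxs] for wx, idxs in wx2_idxs.items()}
wx2_fxs = {wx: idx2_dfx[idxs] for wx, idxs in wx2_idxs.items()}
```

Looking up a word index then yields every database annotation that contributed a descriptor to that word, which is the lookup an inverted index is meant to make fast.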

class ibeis.algo.hots.smk.smk_repr.LazyGetter(getter_func)[source]¶

Bases: object

DEPRECATE

class ibeis.algo.hots.smk.smk_repr.QueryIndex(wx2_qrvecs, wx2_qflags, wx2_maws, wx2_qaids, wx2_qfxs, query_sccw)¶

Bases: tuple

__getnewargs__()¶: Return self as a plain tuple. Used by copy and pickle.

__getstate__()¶: Exclude the OrderedDict from pickling

__repr__()¶: Return a nicely formatted representation string

query_sccw¶: Alias for field number 5

wx2_maws¶: Alias for field number 2

wx2_qaids¶: Alias for field number 3

wx2_qflags¶: Alias for field number 1

wx2_qfxs¶: Alias for field number 4

wx2_qrvecs¶: Alias for field number 0
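
Because QueryIndex derives from tuple and exposes each field as an alias for a field number, it behaves like a standard `collections.namedtuple`. The sketch below rebuilds an equivalent type from the field names listed above; the empty dictionaries and the value 1.0 are placeholder data, not real query output.

```python
from collections import namedtuple

# Field order follows the documented signature and field numbers
QueryIndex = namedtuple(
    'QueryIndex',
    ('wx2_qrvecs', 'wx2_qflags', 'wx2_maws', 'wx2_qaids', 'wx2_qfxs',
     'query_sccw'))

# Placeholder values stand in for real per-word dictionaries
qindex = QueryIndex({}, {}, {}, {}, {}, 1.0)

# Fields can be read by name, by position, or by unpacking
sccw_by_name = qindex.query_sccw
sccw_by_pos = qindex[5]
wx2_qrvecs, wx2_qflags, wx2_maws, wx2_qaids, wx2_qfxs, query_sccw = qindex
```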

ibeis.algo.hots.smk.smk_repr.compute_data_internals_(invindex, qparams, memtrack=None, delete_rawvecs=True)[source]¶

Builds each of the inverted index internals.

invindex (InvertedIndex): object for fast vocab lookup
qparams (QueryParams): hyper-parameters
memtrack (None):
delete_rawvecs (bool):

Returns:	None

Example

>>> from ibeis.algo.hots.smk.smk_repr import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, invindex, qreq_ = smk_debug.testdata_raw_internals0()
>>> compute_data_internals_(invindex, qreq_.qparams)

ibeis.algo.hots.smk.smk_repr.index_data_annots(annots_df, daids, words, qparams, with_internals=True, memtrack=None, delete_rawvecs=False)[source]¶

Builds the initial inverted index from a dataframe, daids, and words. Optionally builds the internals of the inverted structure

Parameters:	memtrack – memory debugging object
Returns:	invindex

Example

>>> from ibeis.algo.hots.smk.smk_repr import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, daids, qaids, qreq_, words = smk_debug.testdata_words()
>>> qparams = qreq_.qparams
>>> with_internals = False
>>> invindex = index_data_annots(annots_df, daids, words, qparams, with_internals)

Auto::

from ibeis.algo.hots.smk import smk_repr
import utool as ut
ut.rrrr()
print(ut.make_default_docstr(smk_repr.index_data_annots))

ibeis.algo.hots.smk.smk_repr.make_annot_df(ibs)[source]¶

Creates a pandas-like DataFrame interface to an IBEISController

DEPRECATE

Parameters:	ibs –
Returns:	annots_df

Example

>>> from ibeis.algo.hots.smk.smk_repr import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs = smk_debug.testdata_ibeis()
>>> annots_df = make_annot_df(ibs)
>>> print(ut.hashstr(repr(annots_df.values)))
j12n+x93m4c!4un3

#>>> from ibeis.algo.hots.smk import smk_debug
#>>> smk_debug.rrr()
#>>> smk_debug.check_dtype(annots_df)

Auto::

from ibeis.algo.hots.smk import smk_repr
import utool as ut
argdoc = ut.make_default_docstr(smk_repr.make_annot_df)
print(argdoc)

ibeis.algo.hots.smk.smk_repr.new_qindex(annots_df, qaid, invindex, qparams)[source]¶

Gets query ready for computations

Parameters:	annots_df (DataFrameProxy) – pandas-like data interface
qaid (int) – query annotation id
invindex (InvertedIndex) – inverted index object
qparams (QueryParams) – query parameters object
Returns:	named tuple containing query information
Return type:	qindex

CommandLine:

python -m ibeis.algo.hots.smk.smk_repr --test-new_qindex

Example

>>> # DISABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_repr import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> ibs, annots_df, qaid, invindex, qparams = smk_debug.testdata_query_repr(db='PZ_Mothers', nWords=128000)
>>> qindex = new_qindex(annots_df, qaid, invindex, qparams)
>>> assert smk_debug.check_wx2_rvecs(qindex.wx2_qrvecs), 'has nan'
>>> smk_debug.invindex_dbgstr(invindex)

Ignore::

idx2_vec = qfx2_vec
idx2_aid = qfx2_aid
idx2_fx = qfx2_qfx
wx2_idxs = _wx2_qfxs
wx2_maws = _wx2_maws
from ibeis.algo.hots.smk import smk_repr
import utool as ut
ut.rrrr()
print(ut.make_default_docstr(smk_repr.new_qindex))

ibeis.algo.hots.smk.smk_repr.report_memory(obj, objname='obj')[source]¶: obj = invindex objname = ‘invindex’

ibeis.algo.hots.smk.smk_residuals module¶

ibeis.algo.hots.smk.smk_residuals.aggregate_rvecs(rvecs, maws)[source]¶

helper for compute_agg_rvecs

Parameters:	rvecs (ndarray) – residual vectors
maws (ndarray) – multi assign weights
Returns:	aggregated residual vectors
Return type:	rvecs_agg

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-aggregate_rvecs
./run_tests.py --exclude-doctest-patterns pipeline neighbor score coverage automated_helpers name automatch chip_match multi_index automated special_query scoring automated nn_weights distinctive match_chips4 query_request devcases hstypes params ibsfuncs smk_core, smk_debug control

Example

>>> # ENABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> rng = np.random.RandomState(0)
>>> rvecs = (hstypes.RVEC_MAX * rng.rand(4, 128)).astype(hstypes.RVEC_TYPE)
>>> maws  = (rng.rand(rvecs.shape[0])).astype(hstypes.FLOAT_TYPE)
>>> rvecs_agg = aggregate_rvecs(rvecs, maws)
>>> result = ut.numpy_str2(rvecs_agg, linewidth=70)
>>> print(result)
np.array([[28, 27, 32, 16, 16, 16, 12, 31, 27, 29, 19, 27, 21, 24, 15,
           21, 17, 37, 13, 40, 38, 33, 17, 30, 13, 23,  9, 25, 19, 15,
           20, 17, 19, 18, 13, 25, 37, 29, 21, 16, 20, 21, 34, 11, 28,
           19, 17, 12, 14, 24, 21, 11, 27, 11, 24, 10, 23, 20, 28, 12,
           16, 14, 30, 22, 18, 26, 21, 20, 18,  9, 29, 20, 25, 19, 23,
           20,  7, 13, 22, 22, 15, 20, 22, 16, 27, 10, 16, 20, 25, 25,
           26, 28, 22, 38, 24, 16, 14, 19, 24, 14, 22, 19, 19, 33, 21,
           22, 18, 22, 25, 25, 22, 23, 32, 16, 25, 15, 29, 21, 25, 20,
           22, 31, 29, 24, 24, 25, 20, 14]], dtype=np.int8)

ibeis.algo.hots.smk.smk_residuals.compress_normvec(arr_float)¶

Compresses 8 or 4 bytes of information into 1 byte. Assumes RVEC_TYPE is int8.

Takes normalized float vectors in the range -1 to 1 with l2norm=1 and compresses each component into 1 byte. Takes advantage of the fact that a component of a vector will rarely be greater than 64, so we can extend the range to double what normally would be allowed. This does mean there is a slight (but hopefully negligible) information loss. It will be negligible when nDims=128; when it is lower, you may want to use a different function.

Parameters:	arr_float (ndarray) – normalized residual vector of type float in range -1 to 1 (with l2 norm of 1)
Returns:	residual vector of type int8 in range -128 to 127
Return type:	(ndarray)

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-compress_normvec_uint8

Example

>>> # ENABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> rng = np.random.RandomState(0)
>>> arr_float = smk_debug.get_test_float_norm_rvecs(2, 5, rng=rng)
>>> vt.normalize_rows(arr_float, out=arr_float)
>>> arr_int8 = compress_normvec_uint8(arr_float)
>>> result = arr_int8
>>> print(result)
[[ 126   29   70  127  127]
 [-127  127  -27  -18   73]]

ibeis.algo.hots.smk.smk_residuals.compress_normvec_float16(arr_float)[source]¶

CURRENTLY THIS IS NOT USED. WE ARE WORKING WITH INT8 INSTEAD

Compresses 8 or 4 bytes of information into 2 bytes. Assumes RVEC_TYPE is float16.

Parameters:	arr_float (ndarray) –
Returns:	ndarray[dtype=np.float16]

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-compress_normvec_float16

Example

>>> # ENABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> rng = np.random.RandomState(0)
>>> arr_float = smk_debug.get_test_float_norm_rvecs(2, 5, rng=rng)
>>> vt.normalize_rows(arr_float, out=arr_float)
>>> arr_float16 = compress_normvec_float16(arr_float)
>>> result = ut.numpy_str(arr_float16, precision=4)
>>> print(result)
np.array([[ 0.4941,  0.1121,  0.2742,  0.6279,  0.5234],
          [-0.6812,  0.6621, -0.1055, -0.0719,  0.2861]], dtype=np.float16)

ibeis.algo.hots.smk.smk_residuals.compress_normvec_uint8(arr_float)[source]¶

Compresses 8 or 4 bytes of information into 1 byte. Assumes RVEC_TYPE is int8.

Takes normalized float vectors in the range -1 to 1 with l2norm=1 and compresses each component into 1 byte. Takes advantage of the fact that a component of a vector will rarely be greater than 64, so we can extend the range to double what normally would be allowed. This does mean there is a slight (but hopefully negligible) information loss. It will be negligible when nDims=128; when it is lower, you may want to use a different function.

Parameters:	arr_float (ndarray) – normalized residual vector of type float in range -1 to 1 (with l2 norm of 1)
Returns:	residual vector of type int8 in range -128 to 127
Return type:	(ndarray)

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-compress_normvec_uint8

Example

>>> # ENABLE_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> rng = np.random.RandomState(0)
>>> arr_float = smk_debug.get_test_float_norm_rvecs(2, 5, rng=rng)
>>> vt.normalize_rows(arr_float, out=arr_float)
>>> arr_int8 = compress_normvec_uint8(arr_float)
>>> result = arr_int8
>>> print(result)
[[ 126   29   70  127  127]
 [-127  127  -27  -18   73]]
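
The doctest output above is consistent with scaling each component by 255, rounding, and clipping to ±127, which is one way to "double the range" of an int8. The sketch below reproduces that output under this assumption; the scale factor, the clip limit, and the function name `compress_unit_vecs_int8` are inferred for illustration and are not taken from the ibeis source. The input values are the rounded floats printed by the compress_normvec_float16 example, which uses the same test data.

```python
import numpy as np

def compress_unit_vecs_int8(arr_float, scale=255.0, limit=127):
    """Quantize unit-norm float rows to int8.

    scale and limit are assumptions inferred from the doctest output.
    """
    scaled = np.rint(np.asarray(arr_float, dtype=np.float64) * scale)
    # Components with magnitude above limit / scale (about 0.5) saturate
    return np.clip(scaled, -limit, limit).astype(np.int8)

arr_float = np.array([[0.4941, 0.1121, 0.2742, 0.6279, 0.5234],
                      [-0.6812, 0.6621, -0.1055, -0.0719, 0.2861]])
arr_int8 = compress_unit_vecs_int8(arr_float)
```

With only 5 dimensions several components exceed 0.5 and saturate at ±127, which illustrates why the docstring warns about information loss at low dimensionality.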

ibeis.algo.hots.smk.smk_residuals.compute_agg_rvecs(rvecs_list, idxs_list, aids_list, maws_list)[source]¶

Driver function for agg residual computation

Sums and normalizes all rvecs that belong to the same word and the same annotation id

Parameters:	rvecs_list (list) – residual vectors grouped by word
idxs_list (list) – stacked descriptor indexes grouped by word
aids_list (list) – annotation rowid for each stacked descriptor index
maws_list (list) – multi assign weights
Returns:	(aggvecs_list, aggaids_list, aggidxs_list, aggmaws_list)
Return type:	tuple

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-compute_agg_rvecs

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_residuals
>>> words, wx_sublist, aids_list, idxs_list, idx2_vec, maws_list = smk_debug.testdata_nonagg_rvec()
>>> rvecs_list, flags_list = smk_residuals.compute_nonagg_rvecs(words, idx2_vec, wx_sublist, idxs_list)
>>> tup = compute_agg_rvecs(rvecs_list, idxs_list, aids_list, maws_list)
>>> aggvecs_list, aggaids_list, aggidxs_list, aggmaws_list, aggflags_list = tup
>>> ut.assert_eq(len(wx_sublist), len(rvecs_list))
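
For a single word, the "sum and normalize per annotation id" step described above might look like the following float-valued sketch. The helper name `aggregate_by_aid` is hypothetical, the multi-assign weights are assumed to scale each residual before summing, and the int8 compression and flag handling of the real function are omitted.

```python
import numpy as np

def aggregate_by_aid(rvecs, aids, maws):
    """Weighted-sum and L2-normalize residuals sharing an annotation id.

    Hypothetical helper operating on the residuals of one word.
    """
    rvecs = np.asarray(rvecs, dtype=np.float64)
    aids = np.asarray(aids)
    maws = np.asarray(maws, dtype=np.float64)
    agg_aids = np.unique(aids)
    agg_vecs = np.empty((len(agg_aids), rvecs.shape[1]))
    for row, aid in enumerate(agg_aids):
        mask = aids == aid
        summed = (maws[mask, None] * rvecs[mask]).sum(axis=0)
        norm = np.linalg.norm(summed)
        # Keep an all-zero sum as zeros instead of dividing by zero
        agg_vecs[row] = summed / norm if norm > 0 else summed
    return agg_vecs, agg_aids

rvecs = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
aids = np.array([5, 5, 9])
maws = np.array([1.0, 1.0, 0.5])
agg_vecs, agg_aids = aggregate_by_aid(rvecs, aids, maws)
```

The driver function would apply this kind of grouping once per word, which is why every input is a list grouped by word.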

ibeis.algo.hots.smk.smk_residuals.compute_nonagg_rvecs(words, idx2_vec, wx_sublist, idxs_list)[source]¶

Driver function for nonagg residual computation

Parameters:	words (ndarray) – array of words
idx2_vec (dict) – stacked vectors
wx_sublist (list) – words of interest
idxs_list (list) – list of idxs grouped by wx_sublist
Returns:	(rvecs_list, flags_list)
Return type:	tuple

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-compute_nonagg_rvecs:0
python -m ibeis.algo.hots.smk.smk_residuals --test-compute_nonagg_rvecs:1

Example

>>> # SLOW_DOCTEST
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> from ibeis.algo.hots.smk import smk_residuals
>>> words, wx_sublist, aids_list, idxs_list, idx2_vec, maws_list = smk_debug.testdata_nonagg_rvec()
>>> rvecs_list, flags_list = smk_residuals.compute_nonagg_rvecs(words, idx2_vec, wx_sublist, idxs_list)
>>> print('Computed size(rvecs_list) = %r' % ut.get_object_size_str(rvecs_list))
>>> print('Computed size(flags_list) = %r' % ut.get_object_size_str(flags_list))

Example2:

>>> # ENABLE_DOCTEST
>>> # The case where vecs == words
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> rng = np.random.RandomState(0)
>>> vecs = (hstypes.VEC_MAX * rng.rand(4, 128)).astype(hstypes.VEC_TYPE)
>>> word = vecs[1]
>>> words = word.reshape(1, 128)
>>> idx2_vec = vecs
>>> idxs_list = [np.arange(len(vecs), dtype=np.int32)]
>>> wx_sublist = [0]
>>> rvecs_list, flags_list = compute_nonagg_rvecs(words, idx2_vec, wx_sublist, idxs_list)
>>> rvecs = rvecs_list[0]
>>> flags = flags_list[0]
>>> maws  = (np.ones(rvecs.shape[0])).astype(hstypes.FLOAT_TYPE)
>>> maws_list = np.array([maws])
>>> aids_list = np.array([np.arange(len(vecs))])

Timeit:

%timeit [~np.any(vecs, axis=1) for vecs in vecs_list]
%timeit [vecs.sum(axis=1) == 0 for vecs in vecs_list]

ibeis.algo.hots.smk.smk_residuals.get_norm_residuals(vecs, word)[source]¶

computes normalized residuals of vectors with respect to a word

Parameters:	vecs (ndarray) –
word (ndarray) –
Returns:	(rvecs_n, rvec_flag)
Return type:	tuple

CommandLine:

python -m ibeis.algo.hots.smk.smk_residuals --test-get_norm_residuals

Example

>>> # ENABLE_DOCTEST
>>> # The case where vecs != words
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> rng = np.random.RandomState(0)
>>> vecs = (hstypes.VEC_MAX * rng.rand(4, 128)).astype(hstypes.VEC_TYPE)
>>> word = (hstypes.VEC_MAX * rng.rand(1, 128)).astype(hstypes.VEC_TYPE)
>>> rvecs_n = get_norm_residuals(vecs, word)
>>> result = ut.numpy_str2(rvecs_n)
>>> print(result)

Example

>>> # ENABLE_DOCTEST
>>> # The case where vecs == words
>>> from ibeis.algo.hots.smk.smk_residuals import *  # NOQA
>>> rng = np.random.RandomState(0)
>>> vecs = (hstypes.VEC_MAX * rng.rand(4, 128)).astype(hstypes.VEC_TYPE)
>>> word = vecs[1]
>>> rvecs_n = get_norm_residuals(vecs, word)
>>> result = ut.numpy_str2(rvecs_n)
>>> print(result)
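
A minimal float-valued sketch of a normalized residual computation, covering both cases exercised by the examples above (vecs != word and vecs == word). It assumes that the returned flag marks rows whose residual is exactly zero, since those cannot be L2-normalized; the helper name `norm_residuals` is hypothetical and the int8 compression used by the library is left out.

```python
import numpy as np

def norm_residuals(vecs, word):
    """Return L2-normalized residuals and a flag for zero residuals."""
    resid = (np.asarray(vecs, dtype=np.float64) -
             np.asarray(word, dtype=np.float64))
    norms = np.linalg.norm(resid, axis=1)
    flags = norms == 0  # True where a vector coincides with the word
    # Divide flagged rows by 1 so they stay zero instead of becoming nan
    safe_norms = np.where(flags, 1.0, norms)
    return resid / safe_norms[:, None], flags

vecs = np.array([[3.0, 4.0], [1.0, 1.0], [1.0, 3.0]])
word = vecs[1]  # the second vector is the word itself
rvecs_n, rvec_flag = norm_residuals(vecs, word)
```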

ibeis.algo.hots.smk.smk_scoring module¶

The functions for scoring smk matches

ibeis.algo.hots.smk.smk_scoring.apply_weights(simmat_list, qmaws_list, dmaws_list, idf_list)[source]¶

Applies multi-assign weights and idf weights to rvec similarity matrices

TODO: Maybe should apply the sccw weights too?

Accounts for rvecs being stored as int8’s

Example

>>> from ibeis.algo.hots.smk.smk_scoring import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> simmat_list, qmaws_list, dmaws_list, idf_list = smk_debug.testdata_apply_weights()
>>> wsim_list = apply_weights(simmat_list, qmaws_list, dmaws_list, idf_list)
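
One plausible reading of the weighting step, for a single word, is an elementwise product of the similarity matrix with the outer product of the query and database multi-assign weights, scaled by the word's idf. The sketch below is written under that assumption; the helper name `weight_simmat` is hypothetical, and the rescaling needed when rvecs are stored as int8 is not shown.

```python
import numpy as np

def weight_simmat(simmat, qmaws, dmaws, idf):
    """Scale one word's similarity matrix by maw and idf weights."""
    # Entry (i, j) pairs query residual i with database residual j
    maw_mat = np.outer(qmaws, dmaws)
    return idf * maw_mat * simmat

simmat = np.array([[1.0, 0.5], [0.2, -0.4]])
qmaws = np.array([1.0, 0.5])
dmaws = np.array([1.0, 0.25])
wsim = weight_simmat(simmat, qmaws, dmaws, idf=2.0)
```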

ibeis.algo.hots.smk.smk_scoring.rvecs_dot_uint8(qrvecs, drvecs)[source]¶

ibeis.algo.hots.smk.smk_scoring.sccw_summation(rvecs_list, flags_list, idf_list, maws_list, smk_alpha, smk_thresh)[source]¶

Computes gamma from “To Aggregate or not to aggregate”. Every component in each list is with respect to a different word.

scc = self-consistency criterion. It is a scalar which ensures K(X, X) = 1.

Parameters:	rvecs_list (list of ndarrays) – residual vectors for every word
idf_list (list of floats) – idf weight for each word
maws_list (list of ndarrays) – multi-assign weights for each word for each residual vector
smk_alpha (float) – selectivity power
smk_thresh (float) – selectivity threshold
Returns:	sccw – self-consistency-criterion weight
Return type:	float

Math::

\begin{equation}
\gamma(X) = \left(\sum_{c \in C} w_c M(X_c, X_c)\right)^{-.5}
\end{equation}

Example

>>> from ibeis.algo.hots.smk.smk_scoring import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_scoring
>>> from ibeis.algo.hots.smk import smk_debug
>>> #idf_list, rvecs_list, maws_list, smk_alpha, smk_thresh, wx2_flags = smk_debug.testdata_sccw_sum(db='testdb1')
>>> tup = smk_debug.testdata_sccw_sum(db='PZ_MTEST', nWords=128000)
>>> idf_list, rvecs_list, flags_list, maws_list, smk_alpha, smk_thresh = tup
>>> sccw = smk_scoring.sccw_summation(rvecs_list, flags_list, idf_list, maws_list, smk_alpha, smk_thresh)
>>> print(sccw)
0.0201041835751

CommandLine:

python smk_match.py --db PZ_MOTHERS --nWords 128
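
The gamma formula can be sketched numerically as follows. This assumes that M(X_c, X_c) is the sum of the selectivity-transformed self-similarity matrix of a word's residuals and that w_c is the word's idf; multi-assign weights and flags are ignored, and the function name `sccw_sketch` is hypothetical.

```python
import numpy as np

def sccw_sketch(rvecs_list, idf_list, smk_alpha, smk_thresh):
    """gamma(X) = (sum_c w_c * M(X_c, X_c)) ** -0.5, simplified."""
    total = 0.0
    for rvecs, idf in zip(rvecs_list, idf_list):
        sim = rvecs @ rvecs.T  # self-similarity within one word
        # Power-law selectivity followed by thresholding
        scores = np.sign(sim) * np.abs(sim) ** smk_alpha
        scores = scores * (scores > smk_thresh)
        total += idf * scores.sum()
    return total ** -0.5

# Two words: one residual in the first, two orthogonal ones in the second
rvecs_list = [np.array([[1.0, 0.0]]),
              np.array([[0.0, 1.0], [1.0, 0.0]])]
idf_list = [1.0, 2.0]
gamma = sccw_sketch(rvecs_list, idf_list, smk_alpha=3, smk_thresh=0)
```

Here the weighted sum is 1 * 1 + 2 * 2 = 5, so gamma is 5 ** -0.5. Multiplying the unnormalized self-score by gamma squared gives 1, which is the self-consistency property K(X, X) = 1.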

ibeis.algo.hots.smk.smk_scoring.score_matches(qrvecs_list, drvecs_list, qflags_list, dflags_list, qmaws_list, dmaws_list, smk_alpha, smk_thresh, idf_list)[source]¶

Similarity + Selectivity: M(X_c, Y_c)

Computes the similarity matrix between word correspondences

Parameters:	qrvecs_list – query vectors for each word
drvecs_list – database vectors for each word
qmaws_list – multi-assign weights for each query word
dmaws_list – multi-assign weights for each database word
smk_alpha – selectivity power
smk_thresh – selectivity threshold
Returns:	list of score matrices
Return type:	list

References

https://lear.inrialpes.fr/~douze/enseignement/2013-2014/presentation_papers/tolias_aggregate.pdf

Example

>>> from ibeis.algo.hots.smk.smk_scoring import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> smk_alpha = 3
>>> smk_thresh = 0
>>> qrvecs_list = [smk_debug.get_test_rvecs(_) for _ in range(10)]
>>> drvecs_list = [smk_debug.get_test_rvecs(_) for _ in range(10)]
>>> qmaws_list  = [smk_debug.get_test_maws(rvecs) for rvecs in qrvecs_list]
>>> dmaws_list  = [np.ones(rvecs.shape[0], dtype=hstypes.FLOAT_TYPE) for rvecs in qrvecs_list]
>>> idf_list = [1.0 for _ in qrvecs_list]
>>> scores_list = score_matches(qrvecs_list, drvecs_list, qmaws_list, dmaws_list, smk_alpha, smk_thresh, idf_list)

ibeis.algo.hots.smk.smk_scoring.selectivity_function(wsim_list, smk_alpha, smk_thresh)[source]¶

Selectivity function: sigma from the SMK paper. rscore = residual score.

Downweights weak matches using power-law normalization and zeroes out any match that is too weak.

Example:

>>> import numpy as np
>>> from ibeis.algo.hots.smk import smk_debug
>>> smk_debug.rrr()
>>> np.random.seed(0)
>>> wsim_list, smk_alpha, smk_thresh = smk_debug.testdata_selectivity_function()

Timeits:

>>> import utool
>>> utool.util_dev.rrr()
>>> setup = utool.codeblock(
...     '''
        import numpy as np
        import scipy.sparse as spsparse
        from ibeis.algo.hots.smk import smk_debug
        np.random.seed(0)
        wsim_list, smk_alpha, smk_thresh = smk_debug.testdata_selectivity_function()
        scores_iter = [
            np.multiply(np.sign(mawmat), np.power(np.abs(mawmat), smk_alpha))
            for mawmat in wsim_list
        ]
        ''')
>>> stmt_list = utool.codeblock(
...     '''
        scores_list0 = [np.multiply(scores, np.greater(scores, smk_thresh)) for scores in scores_iter]
        scores_list1 = [spsparse.coo_matrix(np.multiply(scores, np.greater(scores, smk_thresh))) for scores in scores_iter]
        scores_list2 = [spsparse.dok_matrix(np.multiply(scores, np.greater(scores, smk_thresh))) for scores in scores_iter]
        scores_list3 = [spsparse.lil_matrix(np.multiply(scores, np.greater(scores, smk_thresh))) for scores in scores_iter]
        '''
... ).split('\n')

>>> utool.util_dev.timeit_compare(stmt_list, setup, int(1E4))

scores0 = scores_list0[-1]
scores1 = scores_list1[-1]
scores2 = scores_list2[-1]
scores3 = scores_list3[-1]
%timeit scores0.sum()
%timeit scores1.sum()
%timeit scores2.sum()
%timeit scores3.sum()

ibeis.algo.hots.smk.smk_scoring.similarity_function(qrvecs_list, drvecs_list, qflags_list, dflags_list)[source]¶

Phi dot product.

Parameters:	qrvecs_list (list) – query residual vectors for each matching word
drvecs_list (list) – corresponding database residual vectors
qflags_list (list) – indicates if a query vector was nan
dflags_list (list) – indicates if a database vector was nan
Returns:	simmat_list

qrvecs_list list of rvecs for each word

Example

>>> from ibeis.algo.hots.smk.smk_scoring import *  # NOQA
>>> from ibeis.algo.hots.smk import smk_debug
>>> qrvecs_list, drvecs_list = smk_debug.testdata_similarity_function()
>>> simmat_list = similarity_function(qrvecs_list, drvecs_list)
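
A rough sketch of the per-word dot product, using all four documented arguments. How flagged (nan) residuals should contribute is not stated here, so this sketch simply zeroes their rows and columns; that choice is an assumption and the library may treat them differently. The function name `similarity_matrices` is hypothetical.

```python
import numpy as np

def similarity_matrices(qrvecs_list, drvecs_list, qflags_list, dflags_list):
    """Dot product between query and database residuals, per word."""
    simmat_list = []
    for qrvecs, drvecs, qflags, dflags in zip(
            qrvecs_list, drvecs_list, qflags_list, dflags_list):
        simmat = qrvecs @ drvecs.T  # shape (n_query, n_database)
        # Assumption: flagged vectors contribute no similarity
        simmat[qflags, :] = 0
        simmat[:, dflags] = 0
        simmat_list.append(simmat)
    return simmat_list

qrvecs_list = [np.array([[1.0, 0.0], [0.0, 1.0]])]
drvecs_list = [np.array([[1.0, 0.0], [0.6, 0.8]])]
qflags_list = [np.array([False, False])]
dflags_list = [np.array([False, True])]
simmat_list = similarity_matrices(
    qrvecs_list, drvecs_list, qflags_list, dflags_list)
```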

Module contents¶

ibeis.algo.hots.smk.reload_subs()[source]¶: Reloads smk and submodules

ibeis.algo.hots.smk.rrrr()¶: Reloads smk and submodules