Algorithms API#

Base functions#

paspailleur.algorithms.base_functions.extension(pattern: Pattern, objects_per_pattern: dict[Pattern, bitarray]) → bitarray#

Return the extent of a pattern (a set of objects whose patterns are more precise than pattern).

Parameters:

pattern (Pattern) – The pattern whose extent is computed.
objects_per_pattern (dict[Pattern, bitarray]) – Matches patterns to bitarrays representing associated objects.

Returns:

extent – A bitarray representing the extent of the input pattern.

Return type:

bitarray

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> p = Pattern(frozenset('a'))  # a pattern shared by all four objects
>>> bfuncs.extension(p, obj_to_patterns)
bitarray('1111')

Notes

The objects_per_pattern dictionary can be created from the objects' descriptions using the group_objects_by_patterns function defined below.
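The definition of an extent can be illustrated without the library. The following is a minimal sketch, assuming itemset-like patterns modelled as plain frozensets and extents modelled as sets of object indices rather than bitarrays. The helper name extent_of is hypothetical and is not part of paspailleur.

```python
def extent_of(pattern: frozenset, descriptions: list[frozenset]) -> set[int]:
    """Return the indices of objects whose description contains every item of `pattern`."""
    return {i for i, description in enumerate(descriptions) if pattern <= description}


# Same toy data as in the example above, written as plain frozensets
descriptions = [frozenset('abc'), frozenset('abc'), frozenset('abcd'), frozenset('acde')]
all_objects = extent_of(frozenset('a'), descriptions)   # item 'a' occurs in every description
only_with_b = extent_of(frozenset('ab'), descriptions)  # item 'b' is missing from the last object
```

Under these assumptions the first call covers all four objects, which mirrors the bitarray('1111') result of the doctest.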

paspailleur.algorithms.base_functions.intention(objects: bitarray, objects_per_pattern: dict[Pattern, bitarray]) → Pattern | None#

Compute the intent of a given set of objects.

Parameters:

objects (bitarray) – A bitarray representing selected objects.
objects_per_pattern (dict[Pattern, bitarray]) – A mapping from patterns to object bitarrays.

Returns:

intent – The most specific pattern shared by all objects, if any.

Return type:

Optional[Pattern]

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> obj_ba = bitarray('1100')
>>> bfuncs.intention(obj_ba, obj_to_patterns)
Pattern(frozenset({'a', 'b', 'c'}))
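For itemset-like patterns, the intent of a set of objects is simply what their descriptions have in common. The sketch below assumes frozenset patterns and index sets instead of bitarrays; the name intent_of and the choice to return None for an empty selection are assumptions of this sketch, not documented library behaviour.

```python
from functools import reduce
from typing import Optional


def intent_of(objects: set[int], descriptions: list[frozenset]) -> Optional[frozenset]:
    """Return the items shared by all selected objects, or None when nothing is selected."""
    if not objects:
        return None  # assumption of this sketch: no objects, no intent
    selected = (descriptions[i] for i in sorted(objects))
    return reduce(frozenset.intersection, selected)


descriptions = [frozenset('abc'), frozenset('abc'), frozenset('abcd'), frozenset('acde')]
first_two = intent_of({0, 1}, descriptions)       # both objects are described by 'abc'
first_and_last = intent_of({0, 3}, descriptions)  # 'abc' and 'acde' share only 'a' and 'c'
```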

paspailleur.algorithms.base_functions.patternise_description(active_atoms: bitarray, atomic_patterns: list[Pattern], subatoms_order: list[bitarray], trusted_input: bool = False) → Pattern#

Reconstruct a pattern from its atomic representation.

The function runs the join operation on atomic_patterns indexed by active_atoms, but provides some optimisations using subatoms_order.

Important: The list of atomic_patterns should be topologically sorted. That is, for every pattern, all its subpatterns should have smaller indices.

Parameters:

active_atoms (bitarray) – Bitarray that represents the pattern-to-output in a binary format. That is, active_atoms[i] is True when atomic_patterns[i] is less precise than pattern-to-output. Should be the same length as the list of atomic_patterns.
atomic_patterns (list[Pattern]) – The list of all atomic patterns.
subatoms_order (list[bitarray]) – Partial order on the set of atomic patterns. Value subatoms_order[i][j] is True when atomic_patterns[i] is less precise than atomic_patterns[j].
trusted_input (bool, default=`False`) – Set to True when the list of atomic patterns is guaranteed to be topologically sorted, that is, when every pattern is known to come after all its subpatterns.

Returns:

pattern – Pattern obtained by joining atomic_patterns selected by active_atoms.

Return type:

Pattern
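The core of the reconstruction is a join over the selected atomic patterns. Here is a minimal sketch, assuming itemset-like atoms whose join is a set union. The function name join_active_atoms is hypothetical, and the optimisation based on subatoms_order as well as the trusted_input check are deliberately left out.

```python
def join_active_atoms(active_atoms: list[bool], atomic_patterns: list[frozenset]) -> frozenset:
    """Join (here: set union) the atomic patterns whose flag in `active_atoms` is True."""
    if len(active_atoms) != len(atomic_patterns):
        raise ValueError("active_atoms and atomic_patterns must have the same length")
    pattern = frozenset()
    for is_active, atom in zip(active_atoms, atomic_patterns):
        if is_active:
            pattern |= atom
    return pattern


atomic_patterns = [frozenset('a'), frozenset('b'), frozenset('c'), frozenset('d')]
pattern = join_active_atoms([True, False, True, True], atomic_patterns)  # atoms 'a', 'c', 'd'
```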

paspailleur.algorithms.base_functions.minimal_pattern(objects_per_pattern: dict[Pattern, bitarray]) → Pattern#

Compute the minimal pattern across all object patterns.

Parameters:

objects_per_pattern (dict[Pattern, bitarray]) – Mapping from patterns to object bitarrays.

Returns:

minimal – The minimal pattern.

Return type:

Pattern

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> bfuncs.minimal_pattern(obj_to_patterns)
Pattern(frozenset({'a'}))

paspailleur.algorithms.base_functions.maximal_pattern(objects_per_pattern: dict[Pattern, bitarray]) → Pattern#

Compute the maximal pattern across all object patterns.

Parameters:

objects_per_pattern (dict[Pattern, bitarray]) – Mapping from patterns to object bitarrays.

Returns:

maximal – The maximal pattern.

Return type:

Pattern

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> bfuncs.maximal_pattern(obj_to_patterns)
Pattern(frozenset({'a', 'b', 'c', 'd', 'e'}))

paspailleur.algorithms.base_functions.group_objects_by_patterns(objects_patterns: list[Pattern]) → dict[Pattern, bitarray]#

Group objects by their associated patterns.

Parameters:

objects_patterns (list[Pattern]) – A list where each element corresponds to a pattern describing an object.

Returns:

objects_by_patterns – Dictionary mapping patterns to bitarrays indicating which objects correspond to them.

Return type:

dict[Pattern, bitarray]

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> bfuncs.group_objects_by_patterns(objects_patterns)
{Pattern(frozenset({'a', 'b', 'c'})): bitarray('1100'), Pattern(frozenset({'a', 'b', 'c', 'd'})): bitarray('0010'), Pattern(frozenset({'a', 'c', 'd', 'e'})): bitarray('0001')}
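The grouping step can be sketched in plain Python. This is an illustration only: it assumes hashable frozenset descriptions and uses lists of booleans in place of bitarrays, and the name group_by_description is hypothetical.

```python
def group_by_description(descriptions: list[frozenset]) -> dict[frozenset, list[bool]]:
    """Map every distinct description to a mask of the objects that carry it."""
    n_objects = len(descriptions)
    groups: dict[frozenset, list[bool]] = {}
    for index, description in enumerate(descriptions):
        # Create an all-False mask the first time a description is seen, then flag the object
        groups.setdefault(description, [False] * n_objects)[index] = True
    return groups


descriptions = [frozenset('abc'), frozenset('abc'), frozenset('abcd'), frozenset('acde')]
groups = group_by_description(descriptions)
```

The two identical descriptions end up in one entry, matching the bitarray('1100') group in the doctest.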

paspailleur.algorithms.base_functions.iter_patterns_ascending(patterns: list[Pattern] | OrderedDict[Pattern, Any], greater_patterns_ordering: list[bitarray], controlled_iteration: bool = False) → Generator[Pattern | tuple[Pattern, Any], bool, None]#

Iterate through patterns in ascending order of generalization.

Parameters:

patterns (Union[list[Pattern], OrderedDict[Pattern, Any]]) – List or ordered dict of patterns.
greater_patterns_ordering (list[bitarray]) – Ordering information of which patterns are greater.
controlled_iteration (bool, optional) – If True, allow step-wise iteration with external control (default is False).

Yields:

Generator[Union[Pattern, tuple[Pattern, Any]], bool, None] – Each pattern or pattern-value pair, controlled by input from send().

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> pattern_order = bfuncs.order_patterns_via_extents(list(obj_to_patterns.items()))
>>> for pattern in bfuncs.iter_patterns_ascending(list(obj_to_patterns), pattern_order):
...     print(pattern)

paspailleur.algorithms.base_functions.rearrange_indices(order_before: list[bitarray], elements_before: list, elements_after: list) → list[bitarray]#

Rearrange a list of orderings after element reordering.

Parameters:

order_before (list[bitarray]) – The original ordering as bitarrays.
elements_before (list) – The elements before reordering.
elements_after (list) – The elements after reordering.

Returns:

order_after – Reordered list of bitarrays.

Return type:

list[bitarray]

Examples

>>> from paspailleur.algorithms import base_functions as bfuncs
>>> before = [bitarray('010'), bitarray('001'), bitarray('000')]
>>> elems_before = ['A', 'B', 'C']
>>> elems_after = ['C', 'A', 'B']
>>> bfuncs.rearrange_indices(before, elems_before, elems_after)
[bitarray('001'), bitarray('100'), bitarray('000')]

paspailleur.algorithms.base_functions.order_patterns_via_extents(patterns_extents: list[tuple[Pattern, frozenbitarray]], use_tqdm: bool = False) → list[bitarray]#

Generate the partial order of patterns based on their extents.

This function generates the partial order of patterns using extents for optimising the algorithm. It returns for each pattern a bitarray indicating which other patterns are more general.

Parameters:

patterns_extents (list of tuple[Pattern, fbarray]) – List of patterns and their associated extents.
use_tqdm (bool, optional) – If True, display a progress bar (default is False).

Returns:

patterns_order – A list of bitarrays representing the ordered patterns based on their extents.

Return type:

list[bitarray]

Examples

>>> objects_patterns = [Pattern(frozenset('abc')), Pattern(frozenset('abc')), Pattern(frozenset('abcd')), Pattern(frozenset('acde'))]
>>> from paspailleur.algorithms import base_functions as bfuncs
>>> obj_to_patterns = bfuncs.group_objects_by_patterns(objects_patterns)
>>> pattern_extents = list(obj_to_patterns.items())
>>> order = bfuncs.order_patterns_via_extents(pattern_extents)
>>> order[0]
bitarray('010')

paspailleur.algorithms.base_functions.iterate_antichains(descending_order: list[bitarray], max_length: int = None) → Generator[tuple[int, ...], bool, None]#

Iterate antichains of indices whose partial order is defined by descending_order parameter.

An antichain is a term from order theory for a set of pairwise incomparable elements. That is, a subset of indices {i, j, k, …, n} is an antichain when every pair of indices (i, j), (i, k), … is a pair of incomparable elements, i.e. descending_order[i][j] == descending_order[j][i] == False.

Important: Elements in descending_order should be topologically sorted. That is, for every i-th element, all its lesser elements should have lower indices: from 0 to i-1.

Parameters:

descending_order (list[bitarray]) – Defines the partial order of indices. Value descending_order[i][j]==True indicates that the i-th element is greater than the j-th element.
max_length (int, default = len(descending_order)) – Maximal length of an antichain to yield.

Yields:

antichains_iterator (Generator[tuple[int, ...], bool, None]) – Generator of antichains of the partial order defined by descending_order. The navigation can be controlled by passing a boolean value to antichains_iterator.send(). If the passed value is True, the generator goes on to the antichains that dominate the current one; if it is False, these dominating antichains are skipped.

Examples

Use the function as a generator:

>>> descending_order = [bitarray('0000'), bitarray('0100'), bitarray('1000'), bitarray('0000')]
>>> list(iterate_antichains(descending_order))  # get the list of all possible antichains
[(), (0,), (1,), (1, 0), (2,), (2, 1), (3,), (3, 0), (3, 1), (3, 1, 0), (3, 2), (3, 2, 1)]

Control the navigation over antichains:

>>> descending_order = [bitarray('0000'), bitarray('0100'), bitarray('1000'), bitarray('0000')]
>>> iterator = iterate_antichains(descending_order)
>>> iterator.send(None)  # send None value to get the first antichain which is the empty tuple
()
>>> iterator.send(True)  # get the next antichain while saying True for antichains that dominate the empty antichain
(0,)
>>> iterator.send(False)  # get the next antichain while forbidding any antichain that has elements greater than the 0th
(1,)
>>> list(iterator)  # generate all antichains that are left to iterate
[(3,), (3, 1)]

The second iteration method has omitted all antichains that contain element 0. It has also skipped all antichains that contain element 2, because the 2nd element is defined as greater than the 0th: descending_order[2][0]==True.
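The antichain definition can be cross-checked with a brute-force enumeration. This sketch is not how the library iterates; it uses nested lists of booleans instead of bitarrays, and the helper is_antichain is a hypothetical name. It tests every subset of indices and keeps the ones in which no two distinct elements are comparable.

```python
from itertools import combinations


def is_antichain(indices: tuple, greater_than: list[list[bool]]) -> bool:
    """True when no two distinct indices are comparable in either direction."""
    return all(
        not greater_than[i][j] and not greater_than[j][i]
        for i, j in combinations(indices, 2)
    )


# Same order as in the examples above: greater_than[i][j] is True when i is greater than j
rows = ['0000', '0100', '1000', '0000']
greater_than = [[bit == '1' for bit in row] for row in rows]

n_elements = len(greater_than)
antichains = [
    candidate
    for size in range(n_elements + 1)
    for candidate in combinations(range(n_elements), size)
    if is_antichain(candidate, greater_than)
]
```

The only comparable pair of distinct elements is (2, 0), so the enumeration keeps exactly the 12 subsets that do not contain both 0 and 2, the same sets as those listed in the first example (in a different order and with indices sorted ascending).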

Mine equivalence classes#

paspailleur.algorithms.mine_equivalence_classes.list_intents_via_Lindig_complex(data: list, pattern_structure) → list[PatternDescription]#

Get the list of intents of pattern concepts from data described by pattern_structure, using the Lindig algorithm from “Fast Concept Analysis” by Christian Lindig (Harvard University, Division of Engineering and Applied Sciences).

WARNING: The function does not work at the moment as it was written for the outdated version of PatternStructure code architecture.

Parameters:

data – list of objects described by a pattern structure
pattern_structure – type of pattern structure related to data

Returns:

list of intents of pattern concepts

Return type:

Lattice_data_intents

paspailleur.algorithms.mine_equivalence_classes.iter_intents_via_ocbo(objects_patterns: list[Pattern]) → Iterator[tuple[Pattern, bitarray]]#

Iterate intents by applying the object-wise Close By One algorithm.

References

Kuznetsov, S. O. (1993). A fast algorithm for computing all intersections of objects from an arbitrary semilattice. Nauchno-Tekhnicheskaya Informatsiya Seriya 2-Informatsionnye Protsessy i Sistemy, (1), 17-20.

Parameters:

objects_patterns (list[Pattern]) – List of patterns, one per object.

Returns:

intent_extent_pairs – Yields each pattern (intent) and its extent.

Return type:

Iterator[tuple[Pattern, bitarray]]

paspailleur.algorithms.mine_equivalence_classes.iter_all_patterns_ascending(atomic_patterns_extents: OrderedDict[Pattern, bitarray], min_support: int = 0, depth_first: bool = True, controlled_iteration: bool = False) → Generator[tuple[Pattern, bitarray], bool, None]#

Iterate all patterns in ascending order of precision using atomic patterns.

Parameters:

atomic_patterns_extents (OrderedDict[Pattern, bitarray]) – Atomic patterns and their extents.
min_support (int, optional) – Minimum support for yielded patterns.
depth_first (bool, optional) – Whether to use depth-first traversal (default True).
controlled_iteration (bool, optional) – If True, allows caller to control traversal.

Returns:

pattern_extent_stream – Yields each pattern and its extent.

Return type:

Generator[tuple[Pattern, bitarray], bool, None]

Examples

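>>> from collections import OrderedDict
>>> from bitarray import bitarray
>>> from paspailleur.algorithms import mine_equivalence_classes as mec
>>> from paspailleur.pattern_structures.built_in_patterns import ItemSetPattern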
>>> atomic_patterns_extents = OrderedDict([
...    (ItemSetPattern({'A'}), bitarray('1110')),
...    (ItemSetPattern({'B'}), bitarray('1101')),
...    (ItemSetPattern({'C'}), bitarray('1011'))
... ])

Non-controlled iteration:

>>> for p, e in mec.iter_all_patterns_ascending(atomic_patterns_extents):
...     print(p, e)

Controlled iteration:

>>> gen = mec.iter_all_patterns_ascending(atomic_patterns_extents, controlled_iteration=True)
>>> next(gen)  # initialize
>>> refine_pattern = True
>>> while True:
...     try:
...         pattern, extent = gen.send(refine_pattern)  # control exploration
...     except StopIteration:
...         break
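The send()-based protocol used for controlled iteration is plain Python generator machinery. The sketch below shows the idea on a toy depth-first walk over itemsets; it is not the library's traversal, and the names iter_itemsets_controlled and collect are hypothetical. Sending False after receiving an itemset tells the generator not to refine it any further.

```python
from typing import Callable, Generator


def iter_itemsets_controlled(items: list[str]) -> Generator[frozenset, bool, None]:
    """Depth-first walk over subsets of `items`; the caller sends False to skip refinements."""
    stack = [(frozenset(), 0)]  # (itemset, index of the first item that may still be added)
    while stack:
        itemset, start = stack.pop()
        refine = yield itemset
        if refine is False:  # a plain next() sends None, which counts as "keep refining"
            continue
        for index in range(len(items) - 1, start - 1, -1):
            stack.append((itemset | {items[index]}, index + 1))


def collect(items: list[str], should_refine: Callable[[frozenset], bool]) -> list[frozenset]:
    """Drive the generator, deciding after each itemset whether to explore below it."""
    generator = iter_itemsets_controlled(items)
    visited = []
    try:
        itemset = next(generator)
        while True:
            visited.append(itemset)
            itemset = generator.send(should_refine(itemset))
    except StopIteration:
        pass
    return visited


everything = collect(['A', 'B', 'C'], lambda itemset: True)
without_refining_a = collect(['A', 'B', 'C'], lambda itemset: 'A' not in itemset)
```

With no pruning the walk visits all 8 subsets; refusing to refine anything that contains 'A' leaves the empty set, {A}, {B}, {B, C} and {C}.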

paspailleur.algorithms.mine_equivalence_classes.list_stable_extents_via_gsofia(atomic_patterns_iterator: Generator[tuple[Pattern, frozenbitarray], bool, None], min_delta_stability: int = 0, n_stable_extents: int = None, min_supp: int = 0, use_tqdm: bool = False, n_atomic_patterns: int = None) → set[frozenbitarray]#

Identify stable extents using the gSofia algorithm.

References

Efficient Mining of Subsample-Stable Graph Patterns by Aleksey Buzmakov; Sergei O. Kuznetsov; Amedeo Napoli. Published in: 2017 IEEE International Conference on Data Mining (ICDM)

Parameters:

atomic_patterns_iterator (Generator) – Generator yielding atomic patterns and their extents.
min_delta_stability (int, optional) – Minimum delta stability to accept an extent.
n_stable_extents (int, optional) – Maximum number of stable extents to return.
min_supp (int, optional) – Minimum support required for an extent.
use_tqdm (bool, optional) – Whether to show progress bar.
n_atomic_patterns (int, optional) – Number of atomic patterns expected.

Returns:

stable_extents – Set of stable extents.

Return type:

set[fbarray]

Notes

The extents returned with the n_stable_extents parameter are not necessarily the n most stable extents. They are just n extents that appear to be highly stable.

paspailleur.algorithms.mine_equivalence_classes.iter_keys_of_pattern(pattern: Pattern, atomic_patterns: OrderedDict[Pattern, frozenbitarray], max_length: int | None = None) → Iterator[Pattern]#

Yield key patterns that generate the same extent as the given pattern.

Parameters:

pattern (Pattern) – The target pattern.
atomic_patterns (OrderedDict[Pattern, fbarray]) – Atomic patterns and their extents.
max_length (Optional[int], optional) – Maximum length of key patterns.

Returns:

keys – Iterator of key patterns.

Return type:

Iterator[Pattern]

paspailleur.algorithms.mine_equivalence_classes.iter_keys_of_patterns(patterns: list[Pattern], atomic_patterns: OrderedDict[Pattern, frozenbitarray], max_length: int | None = None) → Iterator[tuple[Pattern, int]]#

Yield key patterns for a list of patterns, maintaining index association.

atomic_patterns should be sorted in topological order: every i-th atomic pattern should be not smaller than any of the previous (1, 2, …, i-1) atomic patterns.

Parameters:

patterns (list[Pattern]) – List of patterns to generate keys for.
atomic_patterns (OrderedDict[Pattern, fbarray]) – Atomic patterns and their extents.
max_length (Optional[int], optional) – Maximum key length.

Returns:

keys_with_index – Iterator of (key, original pattern index) tuples.

Return type:

Iterator[tuple[Pattern, int]]

paspailleur.algorithms.mine_equivalence_classes.iter_keys_of_patterns_via_atoms(patterns: list[tuple[Pattern, frozenbitarray]], atomic_patterns: OrderedDict[Pattern, frozenbitarray], subatoms_order: list[frozenbitarray] = None, max_length: int = None, use_tqdm: bool = False) → Iterator[tuple[Pattern, int]]#

Yield the least precise patterns (aka keys) that describe the same extents as the given patterns.

Parameters:

patterns (list[tuple[Pattern, fbarray]]) – A list of target patterns and their extents
atomic_patterns (OrderedDict[Pattern, fbarray]) – Atomic patterns (and their extents) that will be used for finding keys.
subatoms_order (list[fbarray], optional) – Partial order of atomic_patterns represented with list of frozenbitarrays. The value subatoms_order[i][j] == True means that j-th atomic pattern is less precise than i-th atomic pattern. If the value is not provided (i.e. equals to None), then the partial order will be computed inside this function.
max_length (Optional[int], default = len(atomic_patterns)) – Maximum number of atomic patterns that a key can consist of. This parameter can be used for “early-stopping” to avoid generating overly complex keys.
use_tqdm (bool, default = False) – Flag whether to show tqdm progress bar or not.

Yields:

key (Pattern) – One of the least precise patterns that describe the same extent as i-th provided pattern (identified by pattern_index)
pattern_index (int) – Index of the provided pattern described by key
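The notion of a key can be illustrated by brute force: a key is a smallest combination of atomic patterns whose common extent equals the target extent. The sketch below assumes extents stored as frozensets of object indices and atoms identified by single letters; the function keys_of_extent is hypothetical and makes no attempt to reproduce the library's optimised search.

```python
from itertools import combinations


def keys_of_extent(target: frozenset, atom_extents: dict, n_objects: int) -> list[tuple]:
    """List minimal combinations of atoms whose extents intersect to exactly `target`."""
    all_objects = frozenset(range(n_objects))
    atoms = sorted(atom_extents)
    keys = []
    for size in range(len(atoms) + 1):  # smaller combinations first, so minimal ones are found first
        for candidate in combinations(atoms, size):
            if any(set(key) <= set(candidate) for key in keys):
                continue  # contains an already found key, hence not minimal
            extent = all_objects.intersection(*(atom_extents[atom] for atom in candidate))
            if extent == target:
                keys.append(candidate)
    return keys


# Extents of single items in the toy data 'abc', 'abc', 'abcd', 'acde'
atom_extents = {
    'a': frozenset({0, 1, 2, 3}),
    'b': frozenset({0, 1, 2}),
    'c': frozenset({0, 1, 2, 3}),
    'd': frozenset({2, 3}),
    'e': frozenset({3}),
}
keys_of_abcd = keys_of_extent(frozenset({2}), atom_extents, n_objects=4)
keys_of_acde = keys_of_extent(frozenset({3}), atom_extents, n_objects=4)
```

In this toy data the object described by 'abcd' is already pinned down by the pair ('b', 'd'), and the object described by 'acde' by the single atom 'e'.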

paspailleur.algorithms.mine_equivalence_classes.iter_intents_via_cboi(atomic_patterns: OrderedDict[Pattern, frozenbitarray], superatoms_order: list[frozenbitarray], min_support: int = 0, yield_pattern_intents: bool = True) → Iterator[tuple[Pattern | frozenbitarray, frozenbitarray]]#

Iterate pattern concepts using the Close-by-One-with-Implications (CbOI) algorithm.

The original CbOI algorithm was described in the language of attribute implications in (Belfodil et al., 2019). This implementation describes essentially the same algorithm, but uses the language of partial order on attributes.

Parameters:

atomic_patterns (OrderedDict[Pattern, frozenbitarray]) – Mapping from atomic patterns to what objects they describe. The latter is represented with its characteristic vector stored as a frozenbitarray. So atomic_patterns[p][i] == True means that atomic pattern p describes i-th object.
superatoms_order (list[frozenbitarray]) – Partial order on atomic patterns. For every i-th atomic pattern, it shows the indices of all greater atomic patterns. The partial order should be topologically sorted: for every i-th atomic pattern, all greater patterns should have greater indices.
min_support (int, default = 0) – Minimal number of objects that a concept should describe.
yield_pattern_intents (bool, default = True) – Flag whether to yield the concept’s intent as a Pattern or as a frozenbitarray whose True elements correspond to the atomic patterns of the Pattern.

Yields:

extent (fbarray) – Concept’s extent, i.e. the maximal subset of objects described by concept’s intent.
intent (Pattern or fbarray) – Concept’s intent. If yield_pattern_intents == True then represent intent as the actual Pattern. If yield_pattern_intents == False then represent intent with frozenbitarray describing indices of all atomic patterns that are less precise than the pattern. (Then the actual pattern can be obtained as a Pattern.join of all listed atomic patterns).

References

Belfodil, A., Belfodil, A., & Kaytoue, M. (2019, May). Mining Formal Concepts using Implications between Items. In International Conference on Formal Concept Analysis (pp. 173-190). Cham: Springer International Publishing.

Mine subgroups#

paspailleur.algorithms.mine_subgroups.setup_quality_measure_function(quality_measure: Literal['Accuracy', 'Precision', 'Recall', 'Jaccard', 'F1', 'WRAcc'], quality_threshold: float, n_positives: int, n_objects: int) → tuple[Callable[[int, int], float], int, int]#

Set up a quality measure function based on a specified metric and threshold.

Parameters:

quality_measure (Literal) – The metric to use for subgroup quality evaluation.
quality_threshold (float) – The minimum acceptable value for the quality function.
n_positives (int) – Number of positive (goal) objects.
n_objects (int) – Total number of objects.

Returns:

quality_setup – A tuple of (quality function, minimum true positives, maximum false positives).

Return type:

tuple
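A quality function in this interface takes the numbers of true positives and false positives and returns a score. The sketch below builds such functions from the textbook definitions of the six measures. It is an illustration under stated assumptions: the name make_quality_function is hypothetical, n_positives is assumed to be greater than zero, and the derivation of the minimum-true-positive and maximum-false-positive bounds returned by the library function is not reproduced.

```python
from typing import Callable


def make_quality_function(measure: str, n_positives: int, n_objects: int) -> Callable[[int, int], float]:
    """Build a function (tp, fp) -> score from the textbook definition of `measure`."""

    def precision(tp: int, fp: int) -> float:
        return tp / (tp + fp) if tp + fp else 0.0

    def recall(tp: int, fp: int) -> float:
        return tp / n_positives

    def accuracy(tp: int, fp: int) -> float:
        true_negatives = (n_objects - n_positives) - fp
        return (tp + true_negatives) / n_objects

    def jaccard(tp: int, fp: int) -> float:
        return tp / (n_positives + fp)  # |extent & goal| / |extent | goal|

    def f1(tp: int, fp: int) -> float:
        return 2 * tp / (tp + fp + n_positives)

    def wracc(tp: int, fp: int) -> float:
        coverage = (tp + fp) / n_objects
        return coverage * (precision(tp, fp) - n_positives / n_objects)

    measures = {'Accuracy': accuracy, 'Precision': precision, 'Recall': recall,
                'Jaccard': jaccard, 'F1': f1, 'WRAcc': wracc}
    return measures[measure]


# A pattern covering 3 of the 4 positives and 1 negative, out of 10 objects
scores = {name: make_quality_function(name, n_positives=4, n_objects=10)(3, 1)
          for name in ('Accuracy', 'Precision', 'Recall', 'Jaccard', 'F1', 'WRAcc')}
```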

paspailleur.algorithms.mine_subgroups.iter_subgroups_bruteforce(pattern_structure, goal_objects: bitarray, quality_threshold: float, quality_func: Callable[[int, int], float], tp_min: int = None, fp_max: int = None, max_pattern_length: int = None)#

Find less precise patterns that describe goal objects with sufficient quality via brute-force.

Important

The algorithm does not replicate any existing Subgroup Discovery algorithm (at least, not intentionally). It should get the job done thanks to its greediness, but it might be well behind the State-of-the-Art algorithms.

Parameters:

pattern_structure (PatternStructure) – The pattern structure to mine from.
goal_objects (bitarray) – Bitarray indicating which objects are goal/positive.
quality_threshold (float) – Minimum acceptable value for the quality function.
quality_func (Callable[[int, int], float]) – A function that computes the quality based on true and false positives.
tp_min (int, optional) – Minimum number of true positives.
fp_max (int, optional) – Maximum number of false positives.
max_pattern_length (int, optional) – Maximum allowed length of a pattern.

Returns:

subgroups – Yields qualifying patterns, their extents, and quality scores.

Return type:

Iterator[tuple[Pattern, bitarray, float]]

paspailleur.algorithms.mine_subgroups.iter_subgroups_via_atoms(atomic_patterns: OrderedDict[Pattern, bitarray], goal_objects: bitarray, quality_threshold: float, quality_func: Callable[[int, int], float], tp_min: int = None, max_subgroup_length: int = None, subatoms_order: list[bitarray] = None, use_tqdm: bool = False) → Iterator[tuple[Pattern, frozenbitarray, float]]#

Mine patterns that describe the given goal_objects well enough w.r.t. quality_func and quality_threshold.

The mined patterns are the least precise patterns whose quality_func value is higher than the quality_threshold.

Such patterns can also be called “subgroups” when related to Subgroup Discovery field. The algorithm implemented in this function is rather “bruteforce” and only uses a smart atomic patterns “antichain traversal” as optimisation. Therefore, it can be much slower than the State-of-the-Art algorithms of Subgroup Discovery.

Parameters:

atomic_patterns (OrderedDict[Pattern, bitarray]) – Ordered Dictionary of atomic patterns and their extents (represented with bitarrays). Every yielded pattern is a join of a subset of atomic patterns. The dictionary should be Ordered in order to reflect the specificity order of the atomic patterns. That is, the less precise atomic patterns should be placed in the “beginning” of the dictionary, the more precise patterns should be placed in the “end” of the dictionary, and every atomic pattern should be placed after all its smaller atomic patterns.
goal_objects (bitarray) – The subset of objects to find patterns for. Should be represented with a bitarray where the i-th element equals True when the i-th object belongs to the set of “goal” objects.
quality_threshold (float) – The minimal bound when a pattern can be considered good enough and be yielded by the function. If a pattern is considered good enough, none of its more precise patterns will be tested for their quality.
quality_func (Callable[[int, int], float]) – A function to evaluate the quality of a pattern. The function should follow a specific interface: it takes the number of true-positive and false-positive objects described by a pattern, and outputs the score value. The greater the score is, the better fitted is the pattern. Examples of such quality functions can be generated using function paspailleur.algorithms.mine_subgroups.setup_quality_measure_function.
tp_min (int, optional) – Minimal number of true positives that a pattern should describe. When provided, this value helps to leave out patterns with too small extents. When not provided, it is considered to be 0.
max_subgroup_length (int, default = len(atomic_patterns)) – The maximal number of atomic patterns that can be joined together to form a pattern. When provided, this value helps to leave out patterns that consist of too many atomic patterns, so the patterns that are deemed to be “too complex”.
subatoms_order (list[bitarray], optional) – Subatoms order on atomic patterns from atomic_patterns represented with a list of bitarrays. The value subatoms_order[i][j] should equal True when the j-th atomic pattern is less precise than the i-th atomic pattern: subatoms_order[i][j] == (list(atomic_patterns)[j] <= list(atomic_patterns)[i]). When not provided, all values of subatoms_order are computed inside the function.
use_tqdm (bool, optional) – A flag whether to use a tqdm progress bar to track the number of patterns that were processed by the function. Defaults to False.

Yields:

subgroup (Pattern) – A pattern that describes the goal_objects.
extent (frozenbitarray) – A set of objects described by subgroup represented with a frozenbitarray.
score (float) – The value of quality_func for subgroup.
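The search for the least precise good-enough patterns can be sketched by brute force. This illustration assumes extents stored as frozensets of object indices and single-letter atoms, treats the threshold as inclusive, and ignores tp_min, subatoms_order and the antichain traversal that the real function relies on. The name iter_subgroups_sketch is hypothetical.

```python
from itertools import combinations
from typing import Callable, Iterator


def iter_subgroups_sketch(atom_extents: dict, goal: frozenset, n_objects: int,
                          quality_func: Callable[[int, int], float],
                          quality_threshold: float) -> Iterator[tuple]:
    """Yield (atoms, extent, score) for the smallest atom combinations that score well enough."""
    all_objects = frozenset(range(n_objects))
    atoms = sorted(atom_extents)
    found = []
    for size in range(len(atoms) + 1):  # less precise (shorter) candidates first
        for candidate in combinations(atoms, size):
            if any(set(subgroup) <= set(candidate) for subgroup in found):
                continue  # a less precise pattern was already good enough
            extent = all_objects.intersection(*(atom_extents[atom] for atom in candidate))
            tp, fp = len(extent & goal), len(extent - goal)
            score = quality_func(tp, fp)
            if score >= quality_threshold:  # assumption of this sketch: inclusive bound
                found.append(candidate)
                yield candidate, extent, score


def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp) if tp + fp else 0.0


atom_extents = {
    'a': frozenset({0, 1, 2, 3}),
    'b': frozenset({0, 1, 2}),
    'c': frozenset({0, 1, 2, 3}),
    'd': frozenset({2, 3}),
    'e': frozenset({3}),
}
subgroups = list(iter_subgroups_sketch(atom_extents, goal=frozenset({2, 3}), n_objects=4,
                                       quality_func=precision, quality_threshold=0.9))
```

Only the atoms 'd' and 'e' reach the threshold here, and no combination containing either of them is examined afterwards, which is the pruning behaviour described for quality_threshold.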

Mine implications#

paspailleur.algorithms.mine_implication_bases.iter_proper_premises_from_atomised_premises(premise_extent_iterator: Iterable[tuple[bitarray, bitarray]], minsup_atomic_patterns: OrderedDict[Pattern, bitarray], minsup_subatoms_order: list[bitarray], maxsup_atomic_patterns: OrderedDict[Pattern, bitarray], maxsup_subatoms_order: list[bitarray] = None, yield_patterns: bool = True, reduce_conclusions: bool = False) → Iterator[tuple[Pattern, Pattern] | tuple[bitarray, bitarray]]#

Iterate proper premises and their conclusions based on the premise candidates represented with indices of their atoms.

Important: The sets of minsup_atomic_patterns, maxsup_atomic_patterns, and premises in premise_extent_iterator have to be topologically sorted. That is, the greater the atomic pattern, the greater index it should have.

Parameters:

premise_extent_iterator (Iterable[tuple[bitarray, bitarray]]) – Pairs of premise candidates and their extents. The indices of True elements in premise candidates correspond to atomic patterns in minsup_atomic_patterns.
minsup_atomic_patterns (OrderedDict[Pattern, bitarray]) – Support-minimal atomic patterns and their extents. Atomic pattern is support-minimal when all smaller atomic patterns describe more objects.
minsup_subatoms_order (list[bitarray]) – Partial order on support-minimal atomic patterns. Value minsup_subatoms_order[i][j] is True when j-th sup.min. atomic pattern is smaller than the i-th one. The order should be topologically sorted, that is the greater patterns should have greater indices.
maxsup_atomic_patterns (OrderedDict[Pattern, bitarray]) – Support-maximal atomic patterns and their extents. Atomic pattern is support-maximal when all greater atomic patterns describe fewer objects.
maxsup_subatoms_order (list[bitarray], optional) – Partial order on support-maximal atomic patterns. Value maxsup_subatoms_order[i][j] is True when j-th sup.max. atomic pattern is smaller than the i-th one.
yield_patterns (bool, default True) – Flag whether to output proper premises and their conclusions as Patterns or as bitarrays.
reduce_conclusions (bool, default False) – Flag whether to output the reduced conclusion for each premise (to not repeat the conclusions of other premises) or the full conclusion.

Returns:

premise (Pattern or bitarray) – Proper premise represented as Pattern (when yield_patterns is True) or as a bitarray that references minsup_atomic_patterns.
conclusion (Pattern or bitarray) – Conclusion represented as Pattern (when yield_patterns is True) or as a bitarray that references maxsup_atomic_patterns. When reduce_conclusions is True, output only the part of the conclusion that cannot be deduced from other implications.
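In formal concept analysis, the full conclusion of a premise is its closure: everything shared by all objects that satisfy the premise. The sketch below illustrates only that relation, assuming frozenset patterns; it does not decide whether a premise is proper and does not reduce conclusions. The name full_conclusion and the None result for an unsatisfied premise are choices of this sketch.

```python
from typing import Optional


def full_conclusion(premise: frozenset, descriptions: list[frozenset]) -> Optional[frozenset]:
    """Return everything shared by all objects that satisfy `premise` (None if no object does)."""
    matching = [description for description in descriptions if premise <= description]
    if not matching:
        return None
    return frozenset.intersection(*matching)


descriptions = [frozenset('abc'), frozenset('abc'), frozenset('abcd'), frozenset('acde')]
conclusion_of_b = full_conclusion(frozenset('b'), descriptions)  # every object with 'b' also has 'a', 'c'
conclusion_of_d = full_conclusion(frozenset('d'), descriptions)  # every object with 'd' also has 'a', 'c'
```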

paspailleur.algorithms.mine_implication_bases.iter_pseudo_intents_from_atomised_premises(premises: Iterable[bitarray], atomic_patterns: OrderedDict[Pattern, bitarray], subatoms_order: list[bitarray], yield_patterns: bool = True, reduce_conclusions: bool = False) → Iterator[tuple[Pattern, Pattern] | tuple[bitarray, bitarray]]#

Iterate pseudo-intents and their conclusions based on the premise candidates represented with indices of their atoms.

Important: The set of atomic_patterns has to be topologically sorted. That is, the greater the atomic pattern, the greater index it should have.

Parameters:

premises (Iterable[bitarray]) – List of premises to convert into pseudo-intents. The indices of True elements in premise candidates correspond to atomic patterns in atomic_patterns.
atomic_patterns (OrderedDict[Pattern, bitarray]) – Atomic patterns and their extents. The dictionary should contain both support-minimal and support-maximal patterns.
subatoms_order (list[bitarray]) – Partial order on atomic patterns. Value subatoms_order[i][j] is True when j-th atomic pattern is smaller than the i-th one. The order should be topologically sorted, that is the greater patterns should have greater indices.
yield_patterns (bool, default True) – Flag whether to output pseudo-intents and their conclusions as Patterns or as bitarrays.
reduce_conclusions (bool, default False) – Flag whether to output the reduced conclusion for each premise (to not repeat the conclusions of other premises) or the full conclusion.

Returns:

premise (Pattern or bitarray) – Pseudo-intent represented as Pattern (when yield_patterns is True) or as a bitarray that references atomic_patterns.
conclusion (Pattern or bitarray) – Conclusion represented as Pattern (when yield_patterns is True) or as a bitarray that references atomic_patterns. When reduce_conclusions is True, output only the part of the conclusion that cannot be deduced from other implications.
