Anchoring normalization results in empty character range (from==to) #84

@danielhers

When normalizing graph 20010012 for EDS, node 5 is originally anchored to character 19 (from=19, to=20), but after normalization its anchor is an empty character range (from=to=20). This looks like a bug in the anchoring normalization code:

mtool/graph.py

Lines 85 to 112 in eda24cf

    def union(anchors):
        characters = set();
        for anchor in anchors:
            if "from" in anchor and "to" in anchor:
                for i in range(anchor["from"], anchor["to"]):
                    characters.add(i);
        result = [];
        last = start = None;
        for i in sorted(characters):
            if start is None: start = i;
            if last is None:
                last = i;
                continue;
            elif i == last + 1 \
                 or all(c in score.core.SPACE for c in input[last:i]):
                last = i;
                continue;
            else:
                result.append({"from": start, "to": last + 1});
                last = start = i;
        if len(characters) > 0:
            result.append({"from": start, "to": i + 1});
        if anchors != result:
            old = [anchor for anchor in anchors if anchor not in result];
            new = [anchor for anchor in result if anchor not in anchors];
            print("{} ==> {} [{}]".format(old, new, input),
                  file = sys.stderr);
        return result;

The print on line 110 of that file fires when running it, and outputs:
[{'from': 20, 'to': 20}] ==> [] [Then, in the guests' honor, the speedway hauled out four drivers, crews and even the official Indianapolis 500 announcer for a 10-lap exhibition race.]
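
Reading that log line against the code: the left-hand list is `old`, i.e. the anchors passed in, so the anchor appears to have been empty (from=to=20) already on entry to `union`, which then drops it. A minimal standalone sketch of the same logic reproduces this. Note that `SPACE` is only an assumed stand-in for `score.core.SPACE` (its real contents are not shown here), and the `text` parameter replaces the enclosing `input` variable:

```python
# Assumption: a stand-in for score.core.SPACE; the real set may differ.
SPACE = {" ", "\t", "\n"}

def union(anchors, text):
    """Standalone sketch of the quoted union() logic, without the logging."""
    characters = set()
    for anchor in anchors:
        if "from" in anchor and "to" in anchor:
            # An empty range (from == to) contributes no characters at all.
            characters.update(range(anchor["from"], anchor["to"]))
    result = []
    last = start = None
    for i in sorted(characters):
        if start is None:
            start = i
        if (last is None or i == last + 1
                or all(c in SPACE for c in text[last:i])):
            last = i
            continue
        result.append({"from": start, "to": last + 1})
        last = start = i
    if characters:
        result.append({"from": start, "to": last + 1})
    return result

text = ("Then, in the guests' honor, the speedway hauled out four drivers, "
        "crews and even the official Indianapolis 500 announcer for a "
        "10-lap exhibition race.")

print(union([{"from": 20, "to": 20}], text))  # [] (the empty anchor vanishes)
print(union([{"from": 19, "to": 20}], text))  # [{'from': 19, 'to': 20}]
```

If this reading is right, `union` itself handles the original anchor (from=19, to=20) correctly, and the range is collapsed by an earlier step.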

I think this happens because the apostrophe, which makes up the whole anchor, is treated as a space character.
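
To illustrate that hypothesis, here is a hedged sketch of how trimming skippable characters from both ends of an anchor could collapse a single-character anchor. The `trim` and `trim_or_keep` helpers and the `skip` sets are hypothetical, written only for this example; they are not taken from mtool:

```python
def trim(start, end, text, skip):
    """Hypothetical helper: shrink [start, end) past characters in `skip`."""
    while start < end and text[start] in skip:
        start += 1
    while end > start and text[end - 1] in skip:
        end -= 1
    return start, end

def trim_or_keep(start, end, text, skip):
    """One possible guard (an assumption, not the project's fix):
    keep the original range if trimming would leave nothing."""
    new_start, new_end = trim(start, end, text, skip)
    return (start, end) if new_start == new_end else (new_start, new_end)

text = "Then, in the guests' honor, the speedway hauled out four drivers"

# Character 19 is the apostrophe that makes up the whole anchor of node 5.
print(repr(text[19]))

# If the apostrophe counts as skippable, the anchor collapses to from=to=20.
print(trim(19, 20, text, skip={" ", "'"}))          # (20, 20)

# If only whitespace is skippable, the anchor survives unchanged.
print(trim(19, 20, text, skip={" "}))               # (19, 20)

# With the guard, the original anchor is preserved either way.
print(trim_or_keep(19, 20, text, skip={" ", "'"}))  # (19, 20)
```

The collapsed result matches the from=to=20 range reported above, which is consistent with the apostrophe being stripped, though it does not confirm where in the code that happens.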

Labels: bug