2

I have a custom entity ruler added to the spaCy "en_core_web_sm" model, and I want to add or remove entities in it when needed. This question has already been answered here, but I believe that answer is not correct, because it talks about the ner component, not the entity ruler.
The short version of that answer is that spaCy tends to forget previous patterns when you add new ones.
However, that only happens when you are training the model's ner component with examples. The entity ruler is not trained on examples; it is simply given the patterns and labels to match, and it has worked perfectly for me (I added it after the parser component).
If I'm wrong, please correct me. If I'm right, how do I add or delete entities in the entity ruler (patterns and labels together or separately, whatever is possible)?

import spacy
nlp = spacy.load("en_core_web_sm")

def custom_ruler(file_path):
    ruler = nlp.add_pipe('entity_ruler', after='parser')
    ruler.from_disk(file_path)  # file_path points to the JSONL patterns file

This function is given a jsonl file that contains the entities.
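
For reference, a patterns file of this kind holds one JSON object per line, each with a "label" and a "pattern" key. Below is a minimal sketch of adding and removing entries in such a file using only the standard library. The file name, the example entries, and the helper names read_patterns and write_patterns are made up for illustration and are not part of spaCy.

```python
import json
import tempfile
from pathlib import Path


def read_patterns(path):
    """Load one pattern dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_patterns(path, patterns):
    """Write each pattern dict as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for pattern in patterns:
            f.write(json.dumps(pattern) + "\n")


# Hypothetical file location and entries, used only for illustration.
path = Path(tempfile.mkdtemp()) / "patterns.jsonl"
write_patterns(path, [
    {"label": "ORG", "pattern": "Apple"},
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
])

# Add one entry and drop another, then save the file again.
patterns = read_patterns(path)
patterns.append({"label": "ORG", "pattern": "Initech"})
patterns = [p for p in patterns if p["pattern"] != "Apple"]
write_patterns(path, patterns)
```

The rewritten file can then be loaded with from_disk in the same way as in the function above.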

2 Answers

2

Just as Polm stated, you can add your own rule-based patterns to override the model's predictions. To do this, create a list of dictionaries, each with two keys:

  • pattern defines exactly what text it should override;
  • label defines what label to give to the entity instead.

When you add those patterns to your ruler with the add_patterns() method, every Doc you process afterwards reflects them, and you can iterate over its named entities to see the corrections in play.

    # patterns is a list of dictionaries, each giving the text to match and the label to assign.
    patterns = [{"label": "NATION", "pattern": "Maya"},
                {"label": "NATION", "pattern": "Aztecs"},
                {"label": "DATE", "pattern": "BCE"}]
    # english is the loaded spaCy pipeline; placing the ruler before "ner" lets its matches take precedence.
    ruler = english.add_pipe("entity_ruler", before="ner")
    ruler.add_patterns(patterns)
    # source is the text to analyse; the new entities now show up in the processed Doc.
    chocolate = english(source)
    for entity in chocolate.ents:
        print(entity.text, entity.label_)

You can learn more about the spaCy NLP library from my showcasing notebook or take a look at this YouTube tutorial.

1

You can add items to the entity ruler as usual.

ruler = nlp.get_pipe("entity_ruler")
patterns = [...]  # whatever your patterns are
ruler.add_patterns(patterns)

See the Entity Ruler docs. See the API docs for examples of removal.
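
As a rough sketch of both operations, assuming a spaCy 3 release recent enough to provide EntityRuler.remove: patterns that carry an "id" can be removed by that id, and clear() drops everything at once. The blank pipeline, the ids, and the sample sentence are placeholders chosen for illustration, not something taken from the question.

```python
import spacy

# A blank English pipeline keeps the sketch independent of any trained model.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# The "id" values are hypothetical; they exist so patterns can be removed later.
ruler.add_patterns([
    {"label": "NATION", "pattern": "Maya", "id": "maya"},
    {"label": "NATION", "pattern": "Aztecs", "id": "aztecs"},
])


def entities(text):
    """Return (text, label) pairs for the entities found in `text`."""
    return [(ent.text, ent.label_) for ent in nlp(text).ents]


sentence = "The Maya and the Aztecs traded cacao."
before = entities(sentence)

# Remove every pattern registered under the id "aztecs".
ruler.remove("aztecs")
after = entities(sentence)

# Drop all remaining patterns at once.
ruler.clear()
remaining = len(ruler)
```

Patterns added without an "id" cannot be targeted by remove(). In that case one option is to keep your own list of patterns, filter it, call clear(), and pass the filtered list to add_patterns() again.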
