I have a .docx file containing MCQs in the format shown below. The file has many duplicate MCQs, and I would like to know whether a regex can be written to detect all of them.
I have EditPad Pro 7, Notepad++, PowerGREP and Sublime Text. Every regex I have tried so far removes duplicates line by line, so it also deletes identical options from other questions even though the questions themselves don't match.
In short, I need a regex that deletes a duplicate MCQ only when the whole MCQ (the question plus all of its options) matches, not individual lines or sentences.
I am a novice with respect to regex, so please excuse any inadequacies.
Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed tooth?
A. Saliva.
B. Milk.
C. Saline.
D. Tap water.
Which of the following is the most likely cause of osteoporosis, glaucoma, hypertension and peptic ulcers in a 65 year old with Crohn’s disease?
A. Uncontrolled diabetes.
B. Systemic corticosteroid therapy.
C. Chronic renal failure.
D. Prolonged NSAID therapy.
E. Malabsorption syndrome.
Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
Expected result:
Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed tooth?
A. Saliva.
B. Milk.
C. Saline.
D. Tap water.
Which of the following is the most likely cause of osteoporosis, glaucoma, hypertension and peptic ulcers in a 65 year old with Crohn’s disease?
A. Uncontrolled diabetes.
B. Systemic corticosteroid therapy.
C. Chronic renal failure.
D. Prolonged NSAID therapy.
E. Malabsorption syndrome.
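As an alternative to a single regex, the whole-MCQ comparison described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the .docx text has first been saved or pasted as plain text, every option line starts with one capital letter followed by a period and a space, and every other non-blank line starts a new question. The names `OPTION_RE`, `split_into_mcqs` and `remove_duplicate_mcqs` are made up for illustration.

```python
import re

# Assumption: an option line looks like "A. text", "B. text", and so on.
OPTION_RE = re.compile(r"^[A-Z]\.\s")


def split_into_mcqs(text):
    """Group lines into blocks: one question line followed by its option lines."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between questions
        if OPTION_RE.match(line) and blocks:
            blocks[-1].append(line)  # option belongs to the current question
        else:
            blocks.append([line])  # any other line starts a new question
    return blocks


def remove_duplicate_mcqs(text):
    """Keep the first copy of each MCQ; drop later copies only if the whole block matches."""
    seen = set()
    kept = []
    for block in split_into_mcqs(text):
        key = tuple(block)  # question and all options compared together
        if key not in seen:
            seen.add(key)
            kept.append("\n".join(block))
    return "\n".join(kept)


sample = """Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
In the absence of “Hanks balanced salt solution”, what is the most appropriate media to transport an avulsed tooth?
A. Saliva.
B. Milk.
C. Saline.
D. Tap water.
Lichen planus occurs most frequently on the?
A. buccal mucosa.
B. tongue.
C. floor of the mouth.
D. gingiva.
"""

result = remove_duplicate_mcqs(sample)
print(result)
```

Because each MCQ is compared as one unit, options such as "B. tongue." are never removed on their own. Two MCQs that differ only in spacing inside a line, punctuation or capitalisation would still count as different here; the key would need extra normalisation to catch those.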