pdfHandler module
-
class pdfHandler.CharData(character, bb0, bb1, bb2, bb3)
Bases: object
Hold the properties of a character extracted from the PDF data.
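The constructor signature suggests a small record type. As a minimal sketch only, assuming that `bb0` to `bb3` are the four edges of the character's bounding box (the documentation does not say so), a stand-in with the same fields could look like this. `CharDataSketch`, `width`, and `height` are hypothetical names and are not part of `pdfHandler`.

```python
from dataclasses import dataclass


@dataclass
class CharDataSketch:
    """Hypothetical stand-in for pdfHandler.CharData (not the module's source)."""

    character: str
    bb0: float  # assumed: left edge (x0)
    bb1: float  # assumed: bottom edge (y0)
    bb2: float  # assumed: right edge (x1)
    bb3: float  # assumed: top edge (y1)

    def width(self) -> float:
        # Horizontal extent under the assumed edge ordering.
        return self.bb2 - self.bb0

    def height(self) -> float:
        # Vertical extent under the assumed edge ordering.
        return self.bb3 - self.bb1


c = CharDataSketch("A", 10.0, 20.0, 16.0, 32.0)
```

If the real ordering of the four values differs, only the comments and the two helper methods would need to change.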
-
class pdfHandler.PageData
Bases: object
Hold the data of a single page, including its sentences and characters.
-
addChar(c)
Add character data to the page's character list.
- Parameters
c – The character to add.
- Returns
None
-
addSentence(s)
Add sentence data to the page's sentence list.
- Parameters
s – The sentence to add.
- Returns
None
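The two methods above describe an append-only container per page. The following is a rough sketch of that behavior, not the module's implementation: the class name `PageDataSketch` and the attribute names `chars` and `sentences` are assumptions made for illustration.

```python
class PageDataSketch:
    """Hypothetical stand-in for pdfHandler.PageData."""

    def __init__(self):
        self.chars = []      # assumed attribute name
        self.sentences = []  # assumed attribute name

    def addChar(self, c):
        # Append one character record; returns None, as documented.
        self.chars.append(c)

    def addSentence(self, s):
        # Append one sentence record; returns None, as documented.
        self.sentences.append(s)


page = PageDataSketch()
for ch in "Hi.":
    page.addChar(ch)
page.addSentence("Hi.")
```

In the real module the items would presumably be `CharData` and `SentenceData` objects rather than plain strings.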
-
-
class pdfHandler.PdfHandler(pdfPath)
Bases: object
Handle the data of a whole PDF file.
-
generateHighlightedPdf()
Generate a highlighted PDF, applying the color and annotation text set for each sentence.
- Returns
None
-
getSentence()
Get all sentences of the PDF data.
- Returns
All sentences.
-
makeSentence()
Make sentences from the extracted characters.
- Returns
None
-
textExtracWithCoord()
Extract each character, together with its coordinates, from the PDF data.
- Returns
None
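Read together, the method descriptions suggest a pipeline: extract characters, build sentences, fetch them, mark them, then write the highlighted file. The sketch below encodes that order as an assumption; the documentation does not state it explicitly. `highlight_all` is a hypothetical helper, and it assumes that `getSentence()` returns an iterable of `SentenceData` objects.

```python
def highlight_all(handler, color, note):
    """Run the assumed pipeline on an object exposing the documented methods."""
    handler.textExtracWithCoord()      # extract characters and coordinates
    handler.makeSentence()             # group characters into sentences
    sentences = handler.getSentence()  # assumed: iterable of SentenceData
    for s in sentences:
        s.setColor(color)              # color format is not documented
        s.setAnnotation(note)
    handler.generateHighlightedPdf()   # write the highlighted PDF
    return sentences
```

With the real module this would be called along the lines of `highlight_all(pdfHandler.PdfHandler(path), color, "checked")`, where `path`, `color`, and the note text are placeholders supplied by the caller.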
-
-
class pdfHandler.SentenceData(sentence, rectList, pageNum)
Bases: object
-
setAnnotation(annotText)
Set the annotation text for the sentence. The text is added as an annotation in the PDF data.
- Parameters
annotText – The annotation text.
- Returns
None
-
setColor(color)
Set the annotation color for the sentence.
- Parameters
color – The annotation color.
- Returns
None
-
setRectList(offset)
Define the original coordinates for the PDF.
- Parameters
offset – The offsets from the original coordinates.
- Returns
None
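The description of `setRectList` is brief, so the following is only one plausible reading: each rectangle in `rectList` is translated by a fixed offset. The sketch assumes rectangles are `(x0, y0, x1, y1)` tuples and that `offset` is a `(dx, dy)` pair; neither format is confirmed by the documentation, and `shift_rects` is a hypothetical helper.

```python
def shift_rects(rect_list, offset):
    """Translate each (x0, y0, x1, y1) rectangle by (dx, dy). Formats are assumed."""
    dx, dy = offset
    return [
        (x0 + dx, y0 + dy, x1 + dx, y1 + dy)
        for (x0, y0, x1, y1) in rect_list
    ]


shifted = shift_rects([(10, 20, 30, 40)], (5, -5))
```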
-