Removing frames (text boxes) from a word document, after OCR or saving as rtf from pdf document


spiros


Have you scanned a document with OCR software such as ABBYY FineReader or OmniPage Pro, or saved a PDF document as RTF, and found that the resulting Word document contains multiple frames?

Frames make the document very hard to edit because all text is placed inside frames. We need to remove those frames if we want to edit the document.

How do we do that?

If you do not care about formatting, do this:

1.
—Open the file that contains the frames in MS Word.
—Save the file as a plain text file.
—Open the text file you have just saved in Notepad, WordPad, or some other text editor.
—Select all the text by pressing Ctrl+A, copy it, and paste it into a new MS Word document. Then save it under any name you want. The frames are gone.

If you do care about formatting:

2.
—Copy everything in the Word document, paste all the text into WordPad, copy all the text in the WordPad document, and paste it back into the Word document.

3.
—Select the entire document by pressing Ctrl+A, and then press Ctrl+Q. This resets every paragraph to the formatting defined by its style and will most likely remove the frames.
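
For anyone who prefers a macro to the keyboard shortcut, method 3 can be approximated as follows. This is only a minimal sketch: the macro name ResetAllParagraphs is made up for illustration, and it assumes that ParagraphFormat.Reset (which removes manually applied paragraph formatting) has the same effect as Ctrl+Q on the main text of the document.

```vba
Sub ResetAllParagraphs()
    ' Hypothetical helper name; rough equivalent of Ctrl+A followed by Ctrl+Q.
    ' Content covers the main text of the document only,
    ' not headers, footers or the inside of text boxes.
    ActiveDocument.Content.ParagraphFormat.Reset
End Sub
```

Like Ctrl+Q, this also discards other manual paragraph formatting such as indents and alignment, so it is safer to try it on a copy of the document first.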


4.

Use a macro to remove text boxes and delete their text

Code:
Sub DeleteTextBoxesAndText()
    Dim oShp As Word.Shape
    Dim i As Long

    ' Loop backwards so that deleting a shape does not shift the ones still to be visited
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        Set oShp = ActiveDocument.Shapes(i)
        If oShp.Type = msoTextBox Then
            oShp.Delete    ' removes the text box together with its text
        End If
    Next i
End Sub

5.

Use a macro to remove text boxes but keep text

Code:
Sub RemoveTextBox2()
    Dim shp As Word.Shape
    Dim oRngAnchor As Word.Range
    Dim sString As String
    Dim i As Long

    ' Loop backwards: deleting shapes inside a For Each loop can skip items
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        Set shp = ActiveDocument.Shapes(i)
        If shp.Type = msoTextBox Then
            ' copy text to string, without last paragraph mark
            sString = Left(shp.TextFrame.TextRange.Text, _
              shp.TextFrame.TextRange.Characters.Count - 1)
            If Len(sString) > 0 Then
                ' set the range to insert the text
                Set oRngAnchor = shp.Anchor.Paragraphs(1).Range
                ' insert the textbox text before the range object;
                ' the markers make it easy to find afterwards
                oRngAnchor.InsertBefore _
                  "Textbox start << " & sString & " >> Textbox end"
            End If
            shp.Delete
        End If
    Next i
End Sub

6.

Use a macro to remove frames

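Code:
Sub RemoveFrames()
    Dim aFrame As Word.Frame
    Dim p As Word.Paragraph
    Dim leftPos As Single
    Dim i As Long

    ' Loop backwards: deleting frames inside a For Each loop can skip items
    For i = ActiveDocument.Frames.Count To 1 Step -1
        Set aFrame = ActiveDocument.Frames(i)
        ' measure the frame's horizontal position from the page edge
        aFrame.RelativeHorizontalPosition = wdRelativeHorizontalPositionPage
        leftPos = aFrame.HorizontalPosition
        ' indent the framed paragraphs by that distance to approximate the old position
        For Each p In aFrame.Range.Paragraphs
            p.LeftIndent = leftPos
        Next p
        ' remove the frame itself
        aFrame.Delete
    Next i
End Sub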
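
Macros 4 and 5 act on text boxes, while macro 6 acts on frames, so it can help to check which of the two the document actually contains before running any of them. The following is a small sketch for that purpose; the macro name CountFramesAndTextBoxes is made up for illustration, and it only looks at the main body of the active document.

```vba
Sub CountFramesAndTextBoxes()
    ' Hypothetical helper: reports how many frames and text boxes
    ' the active document contains, without changing anything.
    Dim shp As Word.Shape
    Dim textBoxCount As Long

    For Each shp In ActiveDocument.Shapes
        If shp.Type = msoTextBox Then textBoxCount = textBoxCount + 1
    Next shp

    MsgBox "Frames: " & ActiveDocument.Frames.Count & vbCrLf & _
           "Text boxes: " & textBoxCount
End Sub
```

If it reports only text boxes, macro 4 or 5 is the one to try; if it reports only frames, macro 6.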

7.

Use a macro to remove text boxes but keep text (commercial tool, free trial)

Quickly remove all text boxes and keep texts in Word

and a macro for frames by the same tool:
Quickly remove all frames and keep text from document in Word
« Last Edit: 29 Oct, 2014, 23:09:50 by spiros »


amol

Hi,
Your options for removing frames from an RTF file are not working. If you copy the whole document and paste it into WordPad, the whole alignment is lost: every word becomes left-aligned and a carriage return is inserted after each word. The same thing happens when pressing Ctrl+A and then Ctrl+Q.
An example is given below. By the way, I am exporting the RTF from Crystal Reports 8.5 and then trying to open it in Word/WordPad for editing. Given below are the fields from the file, in which values are placed at runtime.


word rtf

Pt.Name    : Mr. A                                                             Age    :     Yrs.          Sex :  M

Ref. By      :Dr. A. A. PANGI                                               Date   :  17/04/2013
                            MD (MED)

                                   URINE PREGNANCY TEST
Specimen : Early morning urine sample.
Method    : Membrane Immunoassay
Result      : hCG (Human Chorionic Gonadotropin ) PRESENT in urine.
Impression : POSITIVE FOR PREGNANCY



notepad rtf

Age    :     Yrs.          Sex :  M
Pt.Name    : Mr. A
Date   :
Ref. By      :Dr. A. A. PANGI
17/04/2013
MD (MED)
URINE PREGNANCY TEST
Specimen
:
Early morning urine sample.
Method
:
Membrane Immunoassay
Result
:
hCG (Human Chorionic Gonadotropin ) PRESENT in urine.
Impression
:
POSITIVE FOR PREGNANCY


Thank you

Amol



spiros

This is because, in your case, the text boxes are used to position the text. There is no way to keep the same layout and also get rid of the text boxes, since the text boxes are what define the position of the text.


 
