Extracting Address help

Victor_Rodrigue
Star Contributor

What would be the best way to extract the address below into three separate address line fields? Right now I'm capturing the whole address and it's output as a single string. Thank you!

>>Field Recognition: field ID (391) raw result = '815 N 52ND ST #1999 PHOENIX, AZ 85008', suspect level = 40

1 ACCEPTED ANSWER

Steve_Reed
Employee

Hi Victor,

Beginning with Foundation EP1, the 'First line only' extraction type has been enhanced so that any specific line(s) of a list can be extracted. That lets you define three overlapping zones, with line 1 extracted to keyword type Address 1, line 2 to Address 2, and line 3 to Address 3.

If upgrading to EP1 is not an option, you can get the same result with a single zone like the one in your screenshot. Set the zone's keyword type to Address 1 and have the script split the result into its three lines: the script sets lines 2 and 3 on Address 2 and Address 3 itself and returns only line 1 to the engine, which then sets Address 1 as the zone's result. Something along the lines of:

Sub KeyFieldMain

Dim resultline

resultline = OCRDoc.RawTextLine (vbTrue, 0)  ' this is the first line only, what we will return for the zone itself

OCRDoc.CreateNewResultByName "Address Line 2", OCRDoc.RawTextLine (vbTrue, 1), 0, 0, 99  ' this is line 2 being placed into the keyword called "Address Line 2" - modify the keyword type name as necessary

OCRDoc.CreateNewResultByName "Address Line 3", OCRDoc.RawTextLine (vbTrue, 2), 0, 0, 99 ' line 3

OCRDoc.FieldText = resultline  ' only want line 1 to be returned, so the engine sets it for the zone, which is "Address Line 1"

End Sub

3 REPLIES

Victor_Rodrigue
Star Contributor

Hi Steve, thanks for replying so quickly. Option 2 worked beautifully when all three lines are populated. I tested on a form that had no address info on line 2, and the script is moving the line 3 data into the Address Line 2 keyword. Do you know if there's any way to keep that from happening?

>>Field Recognition: field ID (391) raw result = '8249 VIRGINIA AVE APT XX KANSAS CITY, MO 64131', suspect level = 50
>>Field Recognition: Preparing to execute Advanced Capture VB script #187
>>Field Recognition: Finished execution VB script #187, running time 0.0160 seconds
>>Field Recognition: field processing time on page 1 = 0.2820 seconds
>>Field Recognition: successfully processed 1 pages
>>Field Recognition: total field processing time for all pages = 0.2820 seconds
>>Context Completion: document type remains unchanged from original
>>Context Completion: keyword type Address (584), value = '8249 VIRGINIA AVE APT XX'
>>Context Completion: keyword type Address Line 2 (819), value = 'KANSAS CITY, MO 64131'
>>Context Completion: keyword type Address Line 3 (820), value = ''

Perhaps try modifying the script as follows:

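If OCRDoc.RawTextLineCount (vbTrue) = 3 Then

  ' all three lines were recognized: same as before, set keywords 2 and 3 and return line 1 to the engine
  OCRDoc.CreateNewResultByName "Address Line 2", OCRDoc.RawTextLine (vbTrue, 1), 0, 0, 99
  OCRDoc.CreateNewResultByName "Address Line 3", OCRDoc.RawTextLine (vbTrue, 2), 0, 0, 99
  OCRDoc.FieldText = OCRDoc.RawTextLine (vbTrue, 0)

ElseIf OCRDoc.RawTextLineCount (vbTrue) = 2 Then

  ' line 2 of the form was blank, so the second recognized line (index 1) is really line 3 of the data
  ' set only keyword *3* from it, leave "Address Line 2" empty, and still return line 1
  OCRDoc.CreateNewResultByName "Address Line 3", OCRDoc.RawTextLine (vbTrue, 1), 0, 0, 99
  OCRDoc.FieldText = OCRDoc.RawTextLine (vbTrue, 0)

End If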
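
If it helps to sanity-check the line-count rule away from the capture engine, here is a minimal standalone sketch of the same mapping in Python. It is only an illustration: it does not use the OCRDoc object (which exists only inside the script engine), the function name map_address_lines is made up for this example, and it assumes that whenever only two lines are recognized, the blank one was address line 2.

```python
def map_address_lines(raw_lines):
    """Map recognized text lines to (line1, line2, line3) address slots.

    Assumption (not verified against the engine): blank lines are dropped
    by recognition, so a two-line result means address line 2 was empty
    and the second recognized line is really address line 3.
    """
    # Ignore empty or whitespace-only entries, mirroring how the blank
    # line disappears from the raw result in the log above.
    lines = [ln.strip() for ln in raw_lines if ln.strip()]

    line1 = lines[0] if lines else ""
    if len(lines) >= 3:
        # All three lines present: positions match the form.
        return line1, lines[1], lines[2]
    if len(lines) == 2:
        # Line 2 was blank on the form: shift the second line to slot 3.
        return line1, "", lines[1]
    # One line or none: nothing to place in slots 2 and 3.
    return line1, "", ""


if __name__ == "__main__":
    print(map_address_lines(["8249 VIRGINIA AVE APT XX", "KANSAS CITY, MO 64131"]))
```

With the two-line address from the log, line 1 stays in the first slot, the second slot is left empty, and the city/state/ZIP line lands in the third slot. The rule cannot tell a blank line 2 from a blank line 3, so it only fits forms where line 3 is always filled in.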