1

I'm trying to extract some numbers from a few cells that each have a large body of text.

The number strings are accompanied by certain words that precede the number string I'm trying to extract.

I've tried solving the problem with functions like MID, LEFT, RIGHT, LEN, FIND, and SEARCH. However, I keep ending up with the wrong result.

This is due to three issues with the bodies of text:

  • The first issue is that the words preceding the number string differ from cell to cell. Handling this would make the formula highly complex and would require IF, OR, or AND functions.
  • The second issue is that the number string I'm trying to extract varies in length, between 7 and 10 digits.
  • The third issue is that the number string I'm trying to extract isn't the only number string in the body text of the cells.

The solution I currently have picks up characters other than digits, such as spaces, commas, and brackets, if the number string is less than 10 digits long.

So basically I want to know if there is a way to extract the first digits-only string that is 7 to 10 characters long from the body of text. Preferably with formulas only, but VBA is a possibility as well.


Figured I'd edit the OP with the data examples since I haven't received a reply yet.

An example of the data I'm trying to manipulate can be found here: https://www.sendspace.com/file/f7kn6n


Since I haven't received a response in a while I figured I would update with a screenshot of the example data I uploaded a few days ago.

Example data

3
  • It's always good practice to include some data and the expected output so people can test formulas to make sure they work.
    – gtwebb
    Commented May 4, 2016 at 15:48
  • Help us to help you: post several examples of both input and desired result. Commented May 4, 2016 at 16:09
  • Hey @Gary's student and gtwebb, would you like me to edit the opening post with the example data, or add a comment like I'm doing right now? I've tried both solutions, but unfortunately the formula did not work, while the VBA did work (although I get a #VALUE! error in about half of the cells). Commented May 9, 2016 at 7:38

3 Answers

0

If your string of digits will always be the first set of digits in your string, then you can use the following formula. It is an array formula, entered by holding down Ctrl+Shift while pressing Enter:

=MAX(IFERROR(--MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),{7,8,9,10}),0))

If there might be shorter or longer substrings of digits prior to the one you wish to extract, then I would use the UDF below. It uses regular expressions to find the first digit string that is between 7 and 10 digits long. Since it returns a string, it should retain any leading zeros.

Use it in a formula such as:

=FirstDigits(A1)

Copy the code below into a Regular Module:

Option Explicit
Function FirstDigits(S As String) As String
    Dim RE As Object, MC As Object
    ' A run of 7 to 10 digits bounded by word boundaries on both sides
    Const sPat As String = "\b\d{7,10}\b"

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .Pattern = sPat
        If .test(S) Then
            Set MC = RE.Execute(S)
            ' MC(0) is the first match; its default property is the matched text
            FirstDigits = MC(0)
        Else
            FirstDigits = "No digit string 7-10 digits long"
        End If
    End With

End Function
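
One thing to be aware of with the pattern above: in the VBScript regex engine, \b treats letters, digits, and underscores all as word characters, so a digit run glued directly to a letter (for example "ID1234567") is not matched. If your data contains such cases, a variant along the following lines might help. This is only a sketch: the function name FirstDigitsLoose and its pattern are my own assumptions, not part of the answer above, and it returns an empty string instead of a message when nothing is found.

```vba
Option Explicit

' Hypothetical variant of FirstDigits (name and pattern are assumptions).
' Any non-digit counts as a delimiter, so "ID1234567" yields "1234567".
Function FirstDigitsLoose(S As String) As String
    Dim RE As Object, MC As Object

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = False   ' only the first match is needed
        ' (?:^|\D)    start of the text, or one non-digit (not captured)
        ' (\d{7,10})  the digit run we want, available as SubMatches(0)
        ' (?!\d)      the run must not be followed by another digit
        .Pattern = "(?:^|\D)(\d{7,10})(?!\d)"
        Set MC = .Execute(S)
    End With

    If MC.Count > 0 Then
        FirstDigitsLoose = MC(0).SubMatches(0)
    Else
        FirstDigitsLoose = vbNullString
    End If
End Function
```

Used as =FirstDigitsLoose(A1), it should skip digit runs shorter than 7 or longer than 10, just like the original, because a run of 11 or more digits can satisfy neither the leading non-digit nor the trailing "no further digit" condition.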
1
  • Thanks a bunch @Ron Rosenfeld , this solution seems to work as intended Commented May 24, 2016 at 7:25
2

Ignore this answer if you receive a "formula-only" answer that meets your requirements.



This small UDF() will return the first number in a string meeting your requirements:


The code:

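Public Function GetNumber(sIN As String) As Double
    Dim L As Long, i As Long
    Dim s As String, ch As String
    Dim arr As Variant, a As Variant

    ' Blank out every non-digit character so only digit runs remain
    s = sIN
    L = Len(s)
    For i = 1 To L
        ch = Mid(s, i, 1)
        If Not (ch Like "[0-9]") Then
            Mid(s, i, 1) = " "
        End If
    Next i

    ' Worksheet TRIM collapses repeated spaces, so Split yields one digit run per element
    With Application.WorksheetFunction
        arr = Split(.Trim(s), " ")
    End With

    ' Return the first run that is 7 to 10 digits long
    For Each a In arr
        If Len(a) > 6 And Len(a) < 11 Then
            ' CDbl rather than CLng: a 10-digit value above 2,147,483,647
            ' overflows a Long and shows up as #VALUE! in the cell
            GetNumber = CDbl(a)
            Exit Function
        End If
    Next a
    GetNumber = 0
End Function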

User Defined Functions (UDFs) are very easy to install and use:

  1. ALT-F11 brings up the VBE window
  2. ALT-I ALT-M opens a fresh module
  3. paste the stuff in and close the VBE window

If you save the workbook, the UDF will be saved with it. If you are using a version of Excel later than 2003, you must save the file as .xlsm rather than .xlsx.

To remove the UDF:

  1. bring up the VBE window as above
  2. clear the code out
  3. close the VBE window

To use the UDF from Excel:

=getnumber(A1)

To learn more about macros in general, see:

http://www.mvps.org/dmcritchie/excel/getstarted.htm

and

http://msdn.microsoft.com/en-us/library/ee814735(v=office.14).aspx

and for specifics on UDFs, see:

http://www.cpearson.com/excel/WritingFunctionsInVBA.aspx

Macros must be enabled for this to work!

1

Here's a formula I think works.

=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), 
    ((1/MAX(IFERROR(1/(
        ISNUMBER((TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (ROW($1:$25)-1)*LEN(A1)+1, LEN(A1))))*1)*
        (LEN((TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (ROW($1:$25)-1)*LEN(A1)+1, LEN(A1)))))>=7)*
        (LEN((TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (ROW($1:$25)-1)*LEN(A1)+1, LEN(A1)))))<=10)*
        (ROW($1:$25))),-1)))-1)*LEN(A1)+1, LEN(A1)))

To be quite honest, I can't explain it once it's in this format. It started from this formula, which extracts the Nth word:

=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",LEN(A1))), (N-1)*LEN(A1)+1, LEN(A1)))

Then I had to make it check all the words, so I added ROW($1:$25) in place of N (this just gives the array 1:25).

Then I had to check whether each word was a number (ISNUMBER), check its length (the two LEN>=7 and LEN<=10 tests), and multiply the result by the array 1:25.

Then I needed to extract the smallest number that isn't 0, which I did with the 1/MAX(IFERROR(1/ construct. If the criteria aren't met, the value returned is 0; 1/0 errors out, so it is assigned -1. The other values are 1/N; take the MAX and then invert again, which gives the smallest N that isn't 0. Plug that number back into the original formula above to return that word.

Right now it only works for the first 25 words of a string (you could extend ROW($1:$25)). It targets cell A1.
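
If the formula is hard to follow, the steps described above can be sketched in VBA, which may make the logic easier to see. This is only an illustration under my own assumptions: the name FirstNumericWord is hypothetical, and the check is slightly stricter than the formula's, since it accepts digits only, whereas ISNUMBER on a multiplied text value may also accept things like decimals.

```vba
Option Explicit

' Hypothetical helper mirroring the formula's steps (name is an assumption).
Function FirstNumericWord(ByVal s As String) As String
    Dim words() As String
    Dim i As Long
    Dim w As String

    ' Step 1: break the text into space-separated words
    words = Split(s, " ")

    For i = LBound(words) To UBound(words)
        w = words(i)
        ' Steps 2-3: 7 to 10 characters long, and no character outside 0-9
        If Len(w) >= 7 And Len(w) <= 10 Then
            If Not (w Like "*[!0-9]*") Then
                ' Step 4: the first qualifying word wins
                FirstNumericWord = w
                Exit Function
            End If
        End If
    Next i

    ' Nothing qualified
    FirstNumericWord = vbNullString
End Function
```

Called as =FirstNumericWord(A1), this version is not limited to the first 25 words, because it loops over however many words Split returns.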
