Python offers many features and functions for parsing strings and extracting data. Sometimes you may need to extract numbers from a string in Python. This can be a tedious exercise if you try to code it from scratch, but Python's standard library offers a couple of tools that can help. In this article, we will learn how to do this in a couple of ways.
How to Extract Numbers from String in Python
Let us say you have the following string which has numbers too.
line = "hello 50 hi 99"
Let us say you want to extract the numbers 50 and 99 out of this string.
1. Using isdigit()
The isdigit() string method returns True if every character in a non-empty string is a digit, and False otherwise. You can easily extract numbers from a string using isdigit() and a list comprehension, as shown below.
>>> [int(s) for s in line.split() if s.isdigit()]
[50, 99]
Let us look at the above code in detail. line.split() splits the string into a list of words and numbers: ['hello', '50', 'hi', '99']. When called without arguments, split() splits the string on whitespace.
The list comprehension loops through this list one item at a time. For each item, it checks whether the item consists only of digits, using isdigit(). If it does, the item is converted to an integer with int() and added to the result.
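To make these steps explicit, the same logic can be written as an ordinary loop. This is only an illustrative sketch; the variable names tokens and numbers are not part of the original example.

```python
line = "hello 50 hi 99"

tokens = line.split()          # ['hello', '50', 'hi', '99']
numbers = []
for token in tokens:
    # Keep only tokens made up entirely of digits
    if token.isdigit():
        numbers.append(int(token))

print(numbers)  # [50, 99]
```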
Please note, the above code will not match numbers like 54 in 'hi54there'. Since there are no spaces before and after 54, split() does not separate it into its own item, and isdigit() returns False for the whole token 'hi54there'.
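The short check below illustrates this limitation. The helper name extract_standalone_ints and the extra sample strings are made up for illustration. It also shows that tokens with a sign or a decimal point are skipped, because isdigit() returns False for them.

```python
def extract_standalone_ints(text):
    """Return whitespace-separated tokens made up only of digits, as ints."""
    return [int(token) for token in text.split() if token.isdigit()]

print(extract_standalone_ints("hello 50 hi 99"))   # [50, 99]
print(extract_standalone_ints("hi54there"))        # []
print(extract_standalone_ints("temp -7 and 3.5"))  # []
```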
2. Using Regular Expressions
You can also use a regular expression to extract numbers from a string, as shown below.
>>> import re
>>> re.findall(r'\d+', line)
['50', '99']
We use the findall() function from the re module to find all non-overlapping substrings that match our regular expression \d+, which matches one or more consecutive digits.
It will also match 54 in the string 'hi54there'.
>>> re.findall(r'\d+', 'hi54there')
['54']
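If you only want to match standalone numbers, that is, numbers that are not embedded in a word, add the word boundary \b on both sides of the pattern, as shown below. Note that \b matches at any boundary between a word character (letter, digit or underscore) and a non-word character, so it treats punctuation such as a hyphen as a boundary too, not only spaces.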
>>> re.findall(r'\b\d+\b', line)
['50', '99']
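The difference between a word boundary and surrounding whitespace shows up with punctuation. The sketch below uses a made-up sample string; the second pattern, built from lookarounds, is one possible way to require whitespace (or the start or end of the string) around the number, and is not part of the original example.

```python
import re

sample = "hi54there 50 a-7-b"

# \b treats the hyphens as boundaries, so 7 is matched as well
word_bounded = re.findall(r'\b\d+\b', sample)

# (?<!\S) and (?!\S) require that no non-whitespace character
# sits directly before or after the digits
space_bounded = re.findall(r'(?<!\S)\d+(?!\S)', sample)

print(word_bounded)   # ['50', '7']
print(space_bounded)  # ['50']
```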
This approach is more flexible than using isdigit() since it allows you to modify the regular expression as per your requirements. But please note, re.findall() returns a list of strings. If you want integers instead, you can combine it with a list comprehension as shown below.
>>> [int(s) for s in re.findall(r'\b\d+\b', line)]
[50, 99]
In this case, we loop through the list of strings returned by findall() and use int() to convert each of them into an integer.
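As an example of adapting the pattern, the sketch below also captures an optional minus sign and an optional decimal part, and converts the matches with float(). The pattern, the helper name extract_numbers and the sample string are assumptions for illustration; real data may need a different pattern (for example, for thousands separators or exponents).

```python
import re

# Optional minus sign, digits, optional fractional part
NUMBER_PATTERN = re.compile(r'-?\d+(?:\.\d+)?')

def extract_numbers(text):
    """Return all numbers found in text as floats."""
    return [float(match) for match in NUMBER_PATTERN.findall(text)]

print(extract_numbers("temp -7 and 3.5 or 42"))  # [-7.0, 3.5, 42.0]
```

Note that this pattern reads any hyphen directly in front of digits as a minus sign, so it may not suit text where hyphens are used as separators.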
In this article, we have learnt a couple of simple ways to easily extract numbers from string or text data.
Sreeram has more than 10 years of experience in web development, Python, Linux, SQL and database programming.