One of the most common operations that programmers use on strings is to check whether a string contains some other string.
If you are coming to Python from Java, you might have used the contains method to check if some substring exists in some string.
In Python, there are two ways you can achieve that.
First: Using the in operator
The easiest way is to use Python’s in operator.
Let’s take a look at this example.
>>> str = "Messi is the best soccer player"
>>> "soccer" in str
True
>>> "football" in str
False
As you can see, the in operator returns True when the substring exists in the string.
Otherwise, it returns False.
This method is very straightforward, clean, readable, and idiomatic.
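As a small sketch of how the in operator tends to show up in real code, the snippet below uses it in a condition and with its negated form, not in. The variable names are made up for illustration. It also shows two details worth knowing: the check is case-sensitive, and the empty string counts as contained in every string.

```python
sentence = "Messi is the best soccer player"

# The in operator evaluates to a bool, so it can go straight into a condition.
if "soccer" in sentence:
    sport = "soccer"
else:
    sport = "unknown"

# "not in" is the negated form.
missing_football = "football" not in sentence

# The check is case-sensitive: "messi" does not match "Messi".
lowercase_match = "messi" in sentence

# The empty string is considered a substring of every string.
empty_match = "" in sentence
```

If you need a case-insensitive check, one common approach is to lowercase both strings first, for example "messi" in sentence.lower().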
Second: Using the find method
Another method you can use is the string’s find method.
Unlike the in operator, which evaluates to a Boolean value, the find method returns an integer.
This integer is the index at which the first occurrence of the substring begins, or -1 if the substring does not exist in the string.
Let’s see the find method in action.
>>> str = "Messi is the best soccer player"
>>> str.find("soccer")
18
>>> str.find("Ronaldo")
-1
>>> str.find("Messi")
0
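Because find returns an index and not a Boolean, it is easy to misuse in a condition. The sketch below (variable names are illustrative) shows why: a match at position 0 is falsy, while the "not found" value -1 is truthy. Comparing the result against -1 avoids both traps.

```python
sentence = "Messi is the best soccer player"

# Pitfall: a match at index 0 is falsy, so this says "not found" for "Messi".
wrong_for_present = bool(sentence.find("Messi"))

# Pitfall: -1 is truthy, so this says "found" for "Ronaldo".
wrong_for_absent = bool(sentence.find("Ronaldo"))

# Comparing against -1 gives the intended answer in both cases.
found_messi = sentence.find("Messi") != -1
found_ronaldo = sentence.find("Ronaldo") != -1
```

If all you need is a yes-or-no answer, the in operator is the simpler choice; find is most useful when you actually need the position.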
One cool thing about this method is that you can optionally specify a start index and an end index to limit the search to part of the string.
>>> str = "Messi is the best soccer player"
>>> str.find("soccer", 5, 25)
18
>>> str.find("Messi", 5, 25)
-1
Notice how -1 was returned for “Messi” because you are limiting your search to the part of the string from index 5 up to, but not including, index 25.
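To make the start and end arguments more concrete, here is a short sketch comparing a bounded find with a find on a slice. The bounds behave like slice bounds (the end index is excluded), but the index that comes back still refers to the full string, not to the slice.

```python
sentence = "Messi is the best soccer player"

# Bounded search: the result is an index into the full string.
bounded = sentence.find("soccer", 5, 25)

# Searching a slice: the result is relative to the start of the slice.
sliced = sentence[5:25].find("soccer")

# The end index is exclusive. "soccer" occupies indices 18 to 23,
# so an end of 23 cuts off the final "r" and the search fails.
cut_short = sentence.find("soccer", 5, 23)
```

Here bounded is 18, sliced is 13 (that is, 18 minus the slice start of 5), and cut_short is -1.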
Some Advanced Stuff
Assume for a second that Python has no built-in functions or methods that would check if a string contains another string.
How would you write a function to do so?
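Well, an easy way is to use brute force: check whether the substring matches starting at every possible position in the original string.
For larger strings, this process can be really slow, because in the worst case it compares every character of the substring at every position of the string.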
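The brute-force idea described above can be sketched as follows. The function name find_naive is made up for this example, and it mirrors the return convention of find (an index, or -1) purely for illustration.

```python
def find_naive(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every position where the pattern could still fit.
    for start in range(n - m + 1):
        offset = 0
        # Compare character by character until a mismatch or a full match.
        while offset < m and text[start + offset] == pattern[offset]:
            offset += 1
        if offset == m:
            return start
    return -1


sentence = "Messi is the best soccer player"
index_of_soccer = find_naive(sentence, "soccer")
index_of_ronaldo = find_naive(sentence, "Ronaldo")
```

With n characters in the string and m in the substring, the nested loops can perform on the order of n times m comparisons in the worst case, which is where the slowness comes from.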
There are better algorithms for string searching.
I highly recommend this article from TopCoder if you want to learn more and dive deeper into string searching algorithms.
For more coverage of other string searching algorithms not covered in the previous article, this Wikipedia page is great.
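To give a feel for what a "better" algorithm can look like, here is a minimal sketch of the Horspool simplification of Boyer-Moore. It is a textbook-style illustration, not code taken from any particular library, and the function name find_horspool is made up. The key idea is that after a failed comparison, the character aligned with the end of the pattern tells you how far you can safely jump ahead.

```python
def find_horspool(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    # For every character in the pattern except the last one, record the
    # distance from its rightmost occurrence to the end of the pattern.
    # Later occurrences overwrite earlier ones, so the smallest safe shift wins.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Jump based on the text character under the pattern's last position.
        # Characters that never appear in the table allow a full jump of m.
        pos += shift.get(text[pos + m - 1], m)
    return -1


sentence = "Messi is the best soccer player"
horspool_index = find_horspool(sentence, "soccer")
```

On typical text this skips over many positions that the brute-force approach would have examined one by one, although its worst case is still proportional to n times m.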
If you go through the previous articles and study them, your next question would be “well what algorithm does Python actually use?”
These kinds of questions almost always require digging into the source code.
But you are in luck, because Python’s implementation is open source.
Alright, let’s dig into the code.
Perfect, I am happy the developers commented their code 🙂
You can use the in operator or the string’s find method to check if a string contains another string.
The in operator returns True if the substring exists in the string; otherwise, it returns False.
The find method returns the index of the beginning of the substring if found; otherwise, it returns -1.
Python’s implementation (CPython) uses a mix of Boyer-Moore and Horspool for string searching.
If you are a beginner, then I highly recommend this book.
No longer a beginner?
Then you are ready for this book to get to the next level (It’s my favorite).