Daniel Khoo, Software Engineer
ScamShield is a mobile application by Open Government Products to block scam calls and SMS messages. In Part I, we covered how we developed the machine learning model. In Part II, we will discuss implementing call blocking and message filtering on the iOS application.
The ScamShield iOS application has three main features: the blocking of scam calls, the filtering of scam SMS messages, and an in-app scam reporting function. While the reporting and onboarding screens were implemented in React Native, the call blocking and message filtering implementations were written in Swift and are unique to the iOS ecosystem.
iOS Application Extensions
App extensions are key to handling external events such as calls and messages on iOS. So what is an app extension?
Per the Apple documentation, “App extensions let you extend custom functionality and content beyond your app and make it available to users while they’re interacting with other apps or the system.”
Examples of app extensions include Sticker Packs and Custom Keyboards which provide extra features to users while they are in messaging apps. Extensions can run independently from the main application and data isn’t automatically shared between them.
In the case of ScamShield, we require two types of app extension: Call Directory and Message Filtering. Both require the user to grant explicit permission, and users can disable them at any time. App extensions provide a good framework for implementing our intended features while benefiting from Apple’s built-in user privacy protections.
For handling contacts and calls, we use the Call Directory extension. This allows us to add known scammer numbers to a blocklist. To protect user privacy, iOS still handles the incoming call itself; our extension simply provides the list of numbers to be blocked. This means that we are unable to see the number of an incoming caller, or even that a call has been received. Updates to the blocklist are pulled regularly via Background App Refresh.
Note: The scammer numbers are from a list maintained and updated by the Singapore Police Force.
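As a rough illustration of the blocklist flow described above, here is a minimal sketch of a Call Directory extension. The CallKit calls (`CXCallDirectoryProvider`, `addBlockingEntry(withNextSequentialPhoneNumber:)`, `completeRequest()`) are Apple's API, which expects numbers as 64-bit integers in ascending order. The helper `normalizedBlocklist` and the loader `loadBlockedNumbers` are hypothetical names made up for this example, and this is not necessarily how ScamShield stores or loads its list.

```swift
import Foundation
#if canImport(CallKit) && os(iOS)
import CallKit
#endif

/// Hypothetical helper: converts raw numbers into the form the Call Directory
/// API expects, i.e. 64-bit integers in ascending order without duplicates.
func normalizedBlocklist(_ rawNumbers: [String]) -> [Int64] {
    let parsed = rawNumbers.compactMap { raw -> Int64? in
        // Keep ASCII digits only, so "+65 8123 4567" becomes 6581234567.
        let digits = raw.filter { $0.isASCII && $0.isNumber }
        return Int64(digits) // nil for empty or unparseable entries
    }
    return Array(Set(parsed)).sorted()
}

#if canImport(CallKit) && os(iOS)
final class CallDirectoryHandler: CXCallDirectoryProvider {
    // Hypothetical loader: stands in for wherever the refreshed list is kept.
    private func loadBlockedNumbers() -> [String] { [] }

    override func beginRequest(with context: CXCallDirectoryExtensionContext) {
        // The extension only hands iOS a list; it never sees incoming calls.
        for number in normalizedBlocklist(loadBlockedNumbers()) {
            context.addBlockingEntry(withNextSequentialPhoneNumber: number)
        }
        context.completeRequest()
    }
}
#endif
```

Sorting and de-duplicating up front keeps the ordering requirement in one place, so the extension body stays a simple loop.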
The bulk of the development process for ScamShield was the training and deployment of the message filtering model. Unlike most uses of machine learning on mobile, our model does not run while the application is in the foreground. Instead, it needs to run in the background whenever an SMS is received. Apple provides the Message Filter app extension for this use case.
Like the Call Directory, this app extension is designed with user privacy in mind. If the user gives our app permission, future incoming SMS messages from unknown numbers will be sent to the Message Filter app extension to determine if they should be allowed or sent to the junk folder. A number is only considered unknown if you don’t have it saved in your contacts list and you have not previously exchanged messages with it. This means that normal SMS messages with friends and family are not scanned by ScamShield.
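To make the flow above concrete, here is a minimal sketch of a Message Filter extension. `ILMessageFilterExtension`, `ILMessageFilterQueryHandling` and `ILMessageFilterQueryResponse` come from Apple's IdentityLookup framework (the `.junk` action is assumed to be available, which needs iOS 14 or later). The `FilterVerdict` type, the `verdict` function and the placeholder scam check are hypothetical and only stand in for the real classification logic.

```swift
import Foundation
#if canImport(IdentityLookup) && os(iOS)
import IdentityLookup
#endif

/// Hypothetical result type for this sketch.
enum FilterVerdict { case allow, junk }

/// Hypothetical decision function: allows empty messages, otherwise defers
/// to whatever scam check is passed in.
func verdict(forMessageBody body: String?, isScam: (String) -> Bool) -> FilterVerdict {
    guard let body = body, !body.isEmpty else { return .allow }
    return isScam(body) ? .junk : .allow
}

#if canImport(IdentityLookup) && os(iOS)
final class MessageFilterExtension: ILMessageFilterExtension {}

extension MessageFilterExtension: ILMessageFilterQueryHandling {
    func handle(_ queryRequest: ILMessageFilterQueryRequest,
                context: ILMessageFilterExtensionContext,
                completion: @escaping (ILMessageFilterQueryResponse) -> Void) {
        // iOS only calls this for messages from unknown senders.
        let result = verdict(forMessageBody: queryRequest.messageBody) { text in
            // Placeholder check only; a real filter would run its model here.
            text.lowercased().contains("placeholder scam phrase")
        }
        let response = ILMessageFilterQueryResponse()
        response.action = (result == .junk) ? .junk : .allow
        completion(response)
    }
}
#endif
```

Keeping the decision in a plain function means it can be exercised without the extension host.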
As discussed in Part I, the initial trial flow forwarded the messages to the server so that they could be run against our machine learning model.
Apple’s design of the Message Filter extension ensures that only the SMS text and incoming number are forwarded to the server. This makes it impossible for malicious applications to include any identifying information about the user.
To further improve user privacy, the team wanted to run the model locally on device. This would allow ScamShield to minimise the data collected, only forwarding messages already classified as scam to the server.
This is quite an unusual architecture: we are not aware of any existing message filtering application that incorporates an on-device machine learning model. Running a model is typically a CPU- and memory-intensive operation. Our biggest unknowns were whether we would be able to run the model at all inside a background app extension, and how slow it would be.
Fortunately, TensorFlow Lite has extensive documentation and is well supported on iOS as a CocoaPods library. This allowed us to convert our initial server model to the mobile-friendly TensorFlow Lite format and run it in the app extension.
However, as expected, we ran into performance issues, because background applications have strict time and memory limits. In this case we far exceeded the maximum memory size of 6 MB (our model was 100 MB), beyond which iOS would terminate the application, allowing the scam message to pass.
As discussed in Part I, we had to experiment with alternative models that were smaller and able to run with less memory. We finally settled on the average_word_vec model which is just 3.9 MB and runs well within the memory limit. We also opted to run the model single-threaded to reduce memory consumption at the cost of being slightly slower. With some more tweaking we were able to achieve performance similar to that of the original BERT model.
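For illustration, single-threaded inference with the TensorFlow Lite Swift library might look roughly like the sketch below. The `Interpreter` calls are from the TensorFlowLiteSwift API. Everything else is an assumption for this example: the `encode` and `classify` helpers, the vocabulary, the special token ids (pad 0, start 1, unknown 2), the fixed input length, and the idea that the model takes 32-bit integer word ids and returns 32-bit float scores. The actual ScamShield preprocessing may differ.

```swift
import Foundation
#if canImport(TensorFlowLite)
import TensorFlowLite
#endif

/// Hypothetical preprocessing: turns text into a fixed-length list of word ids.
/// The special ids and the tokenisation rule are assumptions for this sketch.
func encode(_ text: String, vocabulary: [String: Int32], length: Int,
            padID: Int32 = 0, startID: Int32 = 1, unknownID: Int32 = 2) -> [Int32] {
    let words = text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
    var ids = [startID] + words.map { vocabulary[$0] ?? unknownID }
    if ids.count > length { ids = Array(ids.prefix(length)) } // truncate long messages
    ids += Array(repeating: padID, count: length - ids.count)  // pad short ones
    return ids
}

#if canImport(TensorFlowLite)
/// Runs one inference and returns the raw output scores.
func classify(_ ids: [Int32], modelPath: String) throws -> [Float32] {
    var options = Interpreter.Options()
    options.threadCount = 1 // single-threaded: lower memory use, slightly slower
    let interpreter = try Interpreter(modelPath: modelPath, options: options)
    try interpreter.allocateTensors()
    let input = ids.withUnsafeBufferPointer { Data(buffer: $0) }
    try interpreter.copy(input, toInputAt: 0)
    try interpreter.invoke()
    let output = try interpreter.output(at: 0)
    return output.data.withUnsafeBytes { Array($0.bindMemory(to: Float32.self)) }
}
#endif
```

How the returned scores map to "scam" or "not scam" depends on how the model was exported, so that step is left out here.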
Though our model is retrained and updated over time to handle new types of messages, scammers continue to evolve and we want to be able to respond quickly to new formats of scam messages that may evade the model. For this we add another layer of message filtering, in which we compare the incoming message with a list of template scam messages using a metric known as Levenshtein distance. This is the minimum number of single-character insertions, deletions and substitutions needed to turn one piece of text into another, so a smaller distance means the two texts are more similar. If an incoming message matches a template, it is classified as a scam without needing to run the model.
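The template check can be sketched as follows. `levenshtein` is the standard dynamic-programming computation of the distance. The matching rule in `matchesTemplate` (distance divided by the length of the longer text, compared against a tolerance of 0.2) is a hypothetical choice for this example; the threshold ScamShield actually uses is not stated here.

```swift
/// Levenshtein distance: the minimum number of single-character insertions,
/// deletions and substitutions needed to turn `a` into `b`.
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    // previous[j] holds the distance between the first i-1 characters of s
    // and the first j characters of t.
    var previous = Array(0...t.count)
    for i in 1...s.count {
        var current = [i] + Array(repeating: 0, count: t.count)
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1,        // deletion
                             current[j - 1] + 1,     // insertion
                             previous[j - 1] + cost) // substitution or match
        }
        previous = current
    }
    return previous[t.count]
}

/// Hypothetical matching rule: a message matches if its distance to any
/// template, relative to the longer of the two texts, is within `tolerance`.
func matchesTemplate(_ message: String, templates: [String],
                     tolerance: Double = 0.2) -> Bool {
    return templates.contains { template in
        let longest = max(message.count, template.count)
        guard longest > 0 else { return true } // two empty strings are identical
        return Double(levenshtein(message, template)) / Double(longest) <= tolerance
    }
}
```

Normalising by length lets one tolerance work for both short and long templates. Keeping only two rows of the table keeps memory use proportional to the template length, which matters under an extension's memory limit.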
Like the list of blocked numbers, our list of template messages can be quickly updated via Background App Refresh. This is powerful as it allows us to respond very quickly to novel scam campaigns. We are now able to protect users within minutes of adding the new scam message to our templates, minimising the window that scammers have to operate.
Where are we now?
As of February 2021, ScamShield iOS has:
- Over 84,000 app installs
- Intercepted and filtered over 200,000 scam messages, averaging ~3,000 messages per day
Key takeaways
- You can run a lean machine learning model to filter incoming messages on iOS
- Apple’s iOS App Extension framework offers powerful APIs while preserving user privacy
Lennard Lim, Product Manager
Ng Wing Yiu, Data Scientist
Aaron Lee, Software Engineer
Seah Chin Ying, Software Engineer
Huang Kaiwen, Software Engineer
Hafizah binte Abu Husin, Designer
Christabel Png, Designer
National Crime Prevention Council (NCPC)
SPF Anti-Scam Centre