I performed a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.
I have previously written about DeepSeek here.
Additional security and privacy concerns about DeepSeek have been raised.
See also this analysis by NowSecure of the iPhone version of DeepSeek.
The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no definitive evidence that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, particularly given the growing concerns around data privacy, surveillance, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.
Key Findings
Suspicious Data Handling & Exfiltration
- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity monitoring. NowSecure identified these in the iPhone app yesterday as well.
- Bespoke encryption and data obfuscation methods are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, instead of depending on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about WebView manipulations are here.
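As an illustration of how hardcoded endpoints like those above can be surfaced during static analysis, here is a minimal sketch that tallies URL hostnames across a decompiled source tree. It assumes the APK has already been decompiled to a directory of text files (for example with a tool such as jadx). The function names `extract_urls`, `count_hosts`, and `hosts_in_tree` are hypothetical helpers for this sketch, not part of any tool used in this report.

```python
import re
from collections import Counter
from pathlib import Path
from urllib.parse import urlparse

# Matches http(s) URLs up to the first whitespace, quote, or angle bracket.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text):
    """Return every http(s) URL found in a blob of text."""
    return URL_RE.findall(text)

def count_hosts(texts):
    """Count URL hostnames across an iterable of text blobs."""
    counts = Counter()
    for text in texts:
        for url in extract_urls(text):
            try:
                host = urlparse(url).hostname
            except ValueError:
                continue  # skip malformed URLs rather than abort the scan
            if host:
                counts[host] += 1
    return counts

def hosts_in_tree(root):
    """Count URL hostnames in all files under a decompiled source directory."""
    files = (p for p in Path(root).rglob("*") if p.is_file())
    return count_hosts(
        p.read_text(encoding="utf-8", errors="ignore") for p in files
    )
```

Calling `hosts_in_tree("decompiled/")` (a placeholder path) returns a `Counter` whose most common entries are a reasonable starting point for manual review. A hostname appearing in the code only shows that the string is present, not that data is actually sent there.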
Device Fingerprinting & Tracking
A significant portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.
- The app collects numerous unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. E.g. it probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, suggesting potential tracking capabilities and the enabling or disabling of fingerprinting regimes by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device information. E.g. if it cannot identify the device through standard Android SIM lookup (because permission was not granted), it attempts manufacturer-specific extensions to access the same information.
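To make the fingerprinting findings above more concrete, the following sketch flags lines of decompiled source that mention well-known Android identifier APIs or common root-detection probes. The indicator list is an assumption chosen for illustration (standard `TelephonyManager` and `PackageManager` method names, the `android_id` settings key, the default Magisk package name, and a typical `su` path); it is not a list extracted from the DeepSeek app, and `classify_lines` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative indicators only: substring -> category shown in the report.
INDICATORS = {
    "getImei": "device identifier (IMEI)",
    "getDeviceId": "device identifier (IMEI/MEID)",
    "getSubscriberId": "subscriber identifier (IMSI)",
    "getSimSerialNumber": "SIM serial number",
    "android_id": "Android ID",
    "getNetworkOperatorName": "carrier details",
    "getInstalledPackages": "installed packages",
    "com.topjohnwu.magisk": "root detection (Magisk)",
    "/system/xbin/su": "root detection (su binary)",
}

def classify_lines(lines):
    """Map each matched category to the 1-based line numbers that mention it."""
    hits = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        for needle, category in INDICATORS.items():
            if needle in line:
                hits[category].append(lineno)
    return dict(hits)
```

A plain substring match like this is deliberately crude: it will miss calls made through reflection or obfuscated strings, and a match indicates only that the reference exists in the code.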
Potential Malware-Like Behavior
While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:
- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible surveillance mechanisms.
- The app makes calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.
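As a rough first pass at the native-library findings above, the sketch below lists the .so entries bundled in an APK (which is a ZIP archive) and reports which of them contain the byte string `dlopen`. This is an assumption-laden heuristic, not the method used for this report: a raw byte search can produce false positives and does not show whether or how the function is called. The name `native_libs_referencing` is hypothetical.

```python
import zipfile

def native_libs_referencing(apk, symbol=b"dlopen"):
    """Map each .so entry in an APK to whether its bytes contain `symbol`.

    `apk` may be a filesystem path or a binary file-like object.
    """
    results = {}
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            if name.endswith(".so"):
                # Crude check: the symbol name appears somewhere in the file.
                results[name] = symbol in archive.read(name)
    return results
```

Libraries flagged `True` are candidates for closer inspection in a disassembler, where the arguments passed to `dlopen()` can show which further files are loaded.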
Remarks
While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no valid reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.
The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.
The use of runtime code loading, together with the bundling of native code, suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.
Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is often justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.
Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within a corporate or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.
Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.