A few days ago, I open sourced AndroidDeviceNames, a tiny Android library that transforms a device model, e.g. SM-N910W8, into a user-friendly name, e.g. Samsung Galaxy Note 4, with a simple method call.
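The snippet itself is missing from this copy, so here is a minimal self-contained sketch of the idea (class and method names are my own, not necessarily the library's actual API):

```java
// Hypothetical sketch: maps a raw model string to a friendly name.
// The real library's API may differ; this only illustrates the call.
import java.util.HashMap;
import java.util.Map;

public class DeviceNames {
    private static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("SM-N910W8", "Samsung Galaxy Note 4");
        // ... more model -> name entries
    }

    /** Returns the friendly name, or the raw model if unknown. */
    public static String getDeviceName(String model) {
        return NAMES.getOrDefault(model, model);
    }
}
```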
Currently, the library recognises about 400 devices. If the device is not on the list, the call above falls back to a default value.
The idea for the library came while working on a side project where I needed to show the device name to the user. Obviously, showing SM-N910W8 is subpar. Searching for a solution, all I could find online was a somewhat obsolete project with a single .properties file containing model = name pairs. I guess the idea is to load the file into a Properties object.
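Loading such a file is a one-liner with the standard library; a sketch:

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

public class PropertiesLookup {
    /**
     * Loads model = name pairs from a .properties-style source.
     * Properties extends Hashtable, which is where the memory overhead comes from.
     */
    public static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }
}
```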
I don’t particularly like this. My side project manipulates bitmaps regularly, and I’d like to avoid any unnecessary memory overhead. Since Properties extends Hashtable, overhead is expected. Here’s the hprof dump:
It finds the model name in ~O(1) time, but takes a whopping 59 kB of memory to do it. In the grand scheme of things, this is no problem for Android, but it can be improved. For instance, we could read the file line by line and:
- discard the pair and continue if it’s not what we’re looking for, or
- save it to a database, or
- remember the pair in a data structure with hopefully less memory overhead than Properties.
Option 1. will only allocate a constant chunk of memory, i.e. the buffer size, but there’s a high penalty to pay if we plan to search often and with different models, as we have to read the file over and over.
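Option 1 can be sketched like this (I'm assuming a model = name line format; only the read buffer is ever allocated):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class StreamingLookup {
    /** Scans the source line by line, discarding pairs that don't match. */
    public static String find(Reader source, String model) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq < 0) continue; // skip malformed lines
                String key = line.substring(0, eq).trim();
                if (key.equals(model)) {
                    return line.substring(eq + 1).trim();
                }
            }
        }
        return null; // not found; the caller decides the fallback
    }
}
```

The downside shown above is exactly the one described: every lookup re-reads the file from the top.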
Option 2. is good, but messy. I’d have to be smart about database opening and closing, there’s suddenly more code involved, I might clutter my API, etc. Hopefully I can avoid it.
Could a data structure mapping model to name Strings perform better than Properties? Investigating option 3. produces the following:
| Data Structure | Retained Heap (kB) | Search time |
| --- | --- | --- |
Besides a nice insight into the memory vs. time tradeoff, there’s still quite a bit of memory involved. It does seem that ArrayMap represents the middle ground between memory and performance, but watch this before jumping to conclusions.
Can things be improved if we don’t need to be as flexible? For example, what if the model-name pairs are embedded in a class as constants, and reading a file is not required? Well, it makes quite an impact:
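For illustration, the String variant with embedded constants might look like this (a one-entry sketch; the generated version would hold all ~400 rows):

```java
public class StringArrayLookup {
    // Generated constants: parallel arrays, index i maps MODELS[i] -> NAMES[i].
    // The pairs live in the class's constant pool; no file I/O at runtime.
    private static final String[] MODELS = {
        "SM-N910W8",
        // ... more generated entries
    };
    private static final String[] NAMES = {
        "Samsung Galaxy Note 4",
        // ... more generated entries
    };

    public static String find(String model) {
        for (int i = 0; i < MODELS.length; i++) {
            if (MODELS[i].equals(model)) return NAMES[i];
        }
        return null;
    }
}
```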
| Data Structure | Retained Heap (kB) | Search time | Factor |
| --- | --- | --- | --- |
If our goal is to save memory, we should definitely avoid reading files. The factor column shows that, for example, the String implementation with constants consumes 17x less memory than its file-reading counterpart.
So how do I effectively move the model-name pairs from a file to Java constants? Because I’m sure as hell not writing them by hand.
It took me a few minutes to write a simple Python script that reads the model-name file and creates a populated Java data structure for me to copy and paste. Sure, it’s more involved than plain Java, but straightforward.
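The post uses Python for this; the same idea sketched in Java, here emitting an if-else chain rather than a data structure (method and class names are mine):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class CodeGen {
    /** Emits an if-else chain mapping models to names, ready to paste into a class. */
    public static String generate(Reader source) throws IOException {
        StringBuilder out = new StringBuilder("static String getDeviceName(String m) {\n");
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int eq = line.indexOf('=');
                if (eq < 0) continue; // skip malformed lines
                String model = line.substring(0, eq).trim();
                String name = line.substring(eq + 1).trim();
                out.append("  if (m.equals(\"").append(model)
                   .append("\")) return \"").append(name).append("\";\n");
            }
        }
        out.append("  return null;\n}\n");
        return out.toString();
    }
}
```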
Sacrificing a bit of flexibility for a memory-efficient implementation, I could stop here. But would it be possible to eliminate memory overhead completely? This was more of an exploration question, born not out of necessity, but curiosity.
I’d have to find a way to eliminate the data structure holding the model-name pairs. A naive but workable approach could be the following:
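The snippet is missing from this copy; my reconstruction of the approach described, with two entries shown:

```java
public class NaiveLookup {
    // No backing data structure: the pairs exist only as bytecode comparisons.
    public static String getDeviceName(String model) {
        if (model.equals("SM-N910W8")) return "Samsung Galaxy Note 4";
        if (model.equals("SPH-D600")) return "Samsung Conquer 4G";
        // ... one branch per known model
        return null;
    }
}
```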
Ugly, but with no memory overhead and an okay search performance of O(n).
I have to be careful, though. When a .class file is loaded, it resides in the method area of the JVM. The method area is logically part of the heap, so it’s futile trying to avoid overhead if the bytecode occupies the same amount of heap as (or more than) a data structure would.
A neat way to look at bytecode is javap. For example, on the Java(TM) SE Runtime Environment (build 1.7.0_79-b15), compiling a trivial Hello class and running javap -c Hello produces a disassembly of its methods.
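The class and its listing are missing from this copy; a reconstruction consistent with the 8-byte main discussed below:

```java
// Reconstruction of the missing example: a minimal class whose main
// disassembles to the 8-byte method discussed in the text.
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello");
    }
}

/* javap -c Hello (abridged; constant-pool indexes may vary):
   public static void main(java.lang.String[]);
     Code:
        0: getstatic     #2   // Field java/lang/System.out:Ljava/io/PrintStream;
        3: ldc           #3   // String Hello
        5: invokevirtual #4   // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        8: return
*/
```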
Each bytecode instruction consists of a one-byte opcode, followed by zero or more operand bytes.
The -c flag ensures the output includes byte offsets for each instruction, so the method size is roughly the same as the last byte offset. Above, the main method size is 8 bytes, as this is the offset of its last instruction, return.
Extending the Python script, I now generate a simple Java method, getDeviceName, which contains the if-else comparisons. I need to ensure that:
- the generated method size mG is smaller than the method size mD of a method querying a data structure, plus the total data structure memory, and
- mG is less than 65536 bytes, the JVM’s limit on a method’s bytecode size.
Benchmarking against the least memory-consuming data structure gives:

| bytecode | 75 + 6233 | 3416 | 9724 |
The generated file is a clear winner and passes the two requirements with flying colors. The only thing left to worry about is the search time: being O(n), it might prove to be unsatisfactory.
A quick and simple optimization we can do right away is branching: take the first letter of the model and compare it only against the models starting with the same letter. We’re still stuck in O(n), but, for example, instead of 361 comparisons to get to SPH-D600, we now do only 61, about 6 times fewer.
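The branching idea can be sketched as follows (a hypothetical generated shape, with only a few entries per branch):

```java
public class BranchedLookup {
    // Switch on the first character, then compare only within that bucket.
    public static String getDeviceName(String model) {
        if (model.isEmpty()) return null;
        switch (model.charAt(0)) {
            case 'S':
                if (model.equals("SM-N910W8")) return "Samsung Galaxy Note 4";
                if (model.equals("SPH-D600")) return "Samsung Conquer 4G";
                break;
            case 'N':
                if (model.equals("Nexus 5")) return "Google Nexus 5";
                break;
            // ... one case per starting letter
        }
        return null;
    }
}
```

A tableswitch on the first character is O(1), so the linear scan is confined to the matching bucket.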
Empirically, I’d like to see the worst case run below 16 milliseconds. Playing around with Traceview, the average running time is well below 5 milliseconds, which puts me at ease about using the generated .java file in my side project.