improvement for real data that has a larger and more
imbalanced tree structure, as in the other experiments.
We have demonstrated that the classification algorithm
proposed by Needell, Saab, and Woolf (2018)
can be readily adapted to classify data in a hierar-
chical way that improves computational efficiency.
We achieve this by using fewer levels to classify
data points predicted to be from classes that are more
readily identifiable. We could potentially further re-
duce computational costs for easier-to-classify data by
reducing the number of measurements m in those
cases as well. Theoretical guarantees as well as mod-
ifications that alleviate error propagation down the
tree are important directions for future work.
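The early-exit idea described above can be sketched in a few lines. The following toy example is only an illustration of the mechanism, not the authors' algorithm: the synthetic Gaussian classes, the two-level class tree, and the nearest-centroid rule applied to 1-bit measurements sign(Ax) are all assumptions made for this sketch.

```python
import numpy as np

# Illustrative sketch: a two-level class tree over synthetic data, with
# 1-bit measurements sign(Ax) in the spirit of binary-data classification.
# Data, tree, and the nearest-centroid node rule are toy assumptions.
rng = np.random.default_rng(0)

d, m, n = 10, 50, 300                 # ambient dim, measurements, samples
means = np.zeros((3, d))
means[0, 0] = 6.0                     # class 0: far from the others ("easy")
means[1, 1] = 2.0                     # classes 1 and 2: closer together
means[2, 1] = -2.0
labels = rng.integers(0, 3, n)
X = means[labels] + rng.normal(size=(n, d))

A = rng.normal(size=(m, d))           # random measurement matrix
B = np.sign(X @ A.T)                  # 1-bit measurements of each sample

def centroid(group):
    """Mean binary-measurement pattern over a group of classes."""
    return B[np.isin(labels, group)].mean(axis=0)

# Level 1 separates the easy class from the rest; level 2 splits the rest.
root = (centroid([0]), centroid([1, 2]))
child = (centroid([1]), centroid([2]))

def closer_to_first(b, ca, cb):
    return np.sum((b - ca) ** 2) <= np.sum((b - cb) ** 2)

def classify(b):
    """Descend the tree; return (predicted class, levels used)."""
    if closer_to_first(b, *root):
        return 0, 1                   # easy class exits after one level
    return (1, 2) if closer_to_first(b, *child) else (2, 2)

preds, levels = zip(*(classify(b) for b in B))
acc = float(np.mean(np.array(preds) == labels))
avg_levels = float(np.mean(levels))   # < 2: easy points skip level two
```

Because the well-separated class exits at the first level, the average number of levels traversed stays below the tree depth, which is the source of the computational savings discussed above; shrinking the number of measurements m for such points, as suggested, would cut costs further.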
The authors were partially supported by NSF CAREER
Grant 1348721 and NSF BIGDATA Grant 1740325.
Aziz, P. M.; Sorensen, H. V.; and Van der Spiegel, J. 1996. An
Overview of Sigma-Delta Converters. IEEE Signal Processing
Magazine 13(1): 61–84. doi.org/10.1109/79.482138
Bottou, L., and Bousquet, O. 2011. The Tradeoffs of Large-Scale Learning. In Optimization for Machine Learning, edited by
S. Sra, S. Nowozin, and S. J. Wright, 351–68. Cambridge, MA:
MIT Press.
Cheong, S.; Oh, S. H.; and Lee, S.-Y. 2004. Support Vector
Machines with Binary Tree Architecture for Multi-class
Classification. Neural Information Processing-Letters and Reviews
2(3): 47–51.
Duncan, D., and Strohmer, T. 2016. Classification of Alzheimer’s Disease Using Unsupervised Diffusion Component
Analysis. Mathematical Biosciences and Engineering 13(6):
Fang, J.; Shen, Y.; Li, H.; and Ren, Z. 2014. Sparse
Signal Recovery from One-Bit Quantized Data: An Iterative
Reweighted Algorithm. Signal Processing 102: 201–6. doi.
Godbole, S.; Sarawagi, S.; and Chakrabarti, S. 2002. Scaling
Multi-Class Support Vector Machines Using Inter-Class
Confusion. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
513–8. New York: Association for Computing Machinery.
Gordon, A. D. 1987. A Review of Hierarchical Classification.
Journal of the Royal Statistical Society. Series A (General) 150(2):
Griffin, G., and Perona, P. 2008. Learning and Using Taxonomies for Fast Visual Categorization. In IEEE Conference on
Computer Vision and Pattern Recognition, 1–8. New York: IEEE.
Gupta, A.; Nowak, R.; and Recht, B. 2010. Sample Complexity
for 1-bit Compressed Sensing and Sparse Classification. In
International Symposium on Information Theory. New York: IEEE.
Higdon, R.; Foster, N. L.; Koeppe, R. A.; DeCarli, C. S.; Jagust,
W. J.; Clark, C. M.; Barbas, N. R.; Arnold, S. E.; Turner, R. S.;
and Heidebrink, J. L. 2004. A Comparison of Classification
Methods for Differentiating Fronto-Temporal Dementia
from Alzheimer’s Disease Using FDG-PET Imaging. Statistics
in Medicine 23(2): 315–26. doi.org/10.1002/sim.1719
Jacques, L.; Laska, J. N.; Boufounos, P. T.; and Baraniuk, R. G.
2013. Robust 1-Bit Compressive Sensing via Binary Stable
Embeddings of Sparse Vectors. IEEE Transactions on Information
Theory 59(4): 2082–102. doi.org/10.1109/TIT.2012.2234823
Li, T.; Zhu, S.; and Ogihara, M. 2007. Hierarchical Document
Classification Using Automatically Generated Hierarchy.
Journal of Intelligent Information Systems 29(2): 211–30. doi.
Molitor, D., and Needell, D. 2018. A Simple Approach to
Hierarchical Classification. In Proceedings of the International
Traveling Workshop on Interactions between Low-Capacity Data
Models and Sensing Techniques (iTWIST). arXiv preprint. arXiv:
1812.00648 [cs.CL]. Ithaca, NY: Cornell University Library.
Needell, D.; Saab, R.; and Woolf, T. 2018. Simple Classification
Using Binary Data. Journal of Machine Learning Research 19(61):
Silla, C. N., and Freitas, A. A. 2011. A Survey of Hierarchical
Classification across Different Application Domains. Data
Mining and Knowledge Discovery 22(1-2): 31–72. doi.org/10.
Silva-Palacios, D.; Ferri, C.; and Ramírez-Quintana, M. J.
2017. Improving Performance of Multiclass Classification
by Inducing Class Hierarchies. Procedia Computer Science
108: 1692–701. doi.org/10.1016/j.procs.2017.05.218
Weston, J., and Watkins, C. 1998. Multi-Class Support Vector
Machines. Technical Report CSD-TR-98-04. Egham, UK: Royal
Holloway, University of London, Department of Computer Science.
Zupan, B.; Bohanec, M.; Demšar, J.; and Bratko, I. 1999. Learning
by Discovering Concept Hierarchies. Artificial Intelligence
109(1-2): 211–42. doi.org/10.1016/S0004-3702(99)00008-9
Denali Molitor is a PhD candidate in the Department of
Mathematics at the University of California, Los Angeles. She
is broadly interested in developing and analyzing machine
learning algorithms and has recently been studying sketch
and project methods for solving large linear systems, data
completion for structured data, and classification methods
using binary data.
Deanna Needell is a full professor in the Department of
Mathematics at the University of California, Los Angeles. Her
research interests include compressed sensing, randomized
algorithms, functional analysis, computational mathematics,
probability, and statistics. She earned her PhD in mathematics
from the University of California, Davis, in 2009 before
working as a postdoctoral fellow at Stanford University, then
as an assistant and later associate professor in the Mathematics
Department at Claremont McKenna College in Claremont,
California. She has earned many awards, including the IEEE
Best Young Author award, an Alfred P. Sloan fellowship, NSF
CAREER and NSF BIGDATA awards, and the IMA Prize in
Applied Mathematics. She is an associate editor of IEEE Signal
Processing Letters, Linear Algebra and Its Applications, SIAM Journal
on Imaging Sciences, and Transactions in Mathematics and Its Applications and serves on the organizing committees for sessions of
the Society for Industrial and Applied Mathematics and the
Association for Women in Mathematics.