Shipeng Yu
Staff Scientist
Siemens Medical Solutions USA, Inc.
CONTACT INFORMATION
Tel: +1-610-448-4420 (O)
Fax: +1-610-448-4274 (O)
URL: http://www.dbs.ifi.lmu.de/~spyu/
Email: shipeng.yu@siemens.com (work), shipeng.yu@gmail.com (private)
RESEARCH INTERESTS
• Statistical Machine Learning, Probabilistic Modeling, Bayesian Data Analysis
• Bayesian Clustering, Dimensionality Reduction, Supervised Projection
• Document Modeling, Collaborative Filtering, Ranking
• Information Retrieval and Extraction, Text Mining, Web Search and Mining
EDUCATION
Ph.D. in
Institute for Computer Science, University of
Supervisors: Prof. Dr. Hans-Peter Kriegel, Dr. Volker Tresp
M.Sc. in Information Science, School of Mathematical Sciences,
Supervisor: Prof. Dr. Zuoquan Lin
GPA: 91.6/100; Rank: 2/70
B.Sc. in Information Science, School of Mathematical Sciences,
GPA: 93.5/100 for major, 91.7/100 overall; Rank: 2/41
RESEARCH EXPERIENCE
11/2006 – Present: Staff Scientist at Siemens Medical Solutions USA, Inc.
Directions: statistical machine learning for medical data mining
9/2003 – 9/2006: Guest Research Scientist at Siemens Corporate Technology,
Directions: statistical machine learning, data mining and information retrieval
• Proposed Dirichlet enhanced latent semantic analysis for documents
• Introduced a variational Bayesian method for infinite mixtures of exponential family distributions
• Co-invented a supervised projection model and its probabilistic extensions and interpretations
• Co-invented a soft clustering algorithm on graphs with applications to semi-supervised learning
• Introduced a multi-task Gaussian Process model for the collaborative ranking problem
• Proposed supervised and semi-supervised PCA methods that can be applied to very large data sets
7/2005 – 9/2005: Research Internship at Microsoft Research
Topic: Bayesian web ranking with the TrueSkill™ system
• Addressed the web ranking problem by extending the TrueSkill™ system for games
• Considered time-independent ranking and feature-dependent ranking for the web ranking scenario
• Implemented the full algorithm to handle millions of document-query pairs
• Proposed a probabilistic method to evaluate judges and measure query difficulty
9/2001 – 8/2003: Visiting Student at Microsoft Research
Directions: web page segmentation and its applications to web information retrieval, extraction and mining
• Proposed a novel vision-based content structuring method for web pages
• Verified that vision-based page segmentation (VIPS) could significantly improve web IR performance
• Proposed a combined page segmentation approach that achieved the best performance
• Introduced a new strategy for automatic web information extraction based on web content structure
1/2000 – 7/2003: Research Assistant at
Directions: web-based DRP systems with security issues
• Team Member of WebDaemon, a role-based access control (RBAC) system on the web
• Project Leader of TYGXC, an Internet-based DRP system for TianYin Corporation,
• Team Member of TCLJXC, an Internet-based DRP system for TCL Corporation,
SELECTED PUBLICATIONS
• Shipeng Yu, Balaji Krishnapuram, Romer Rosales, Harald Steck, and R. Bharat Rao. Bayesian Multi-View Learning. To appear as a spotlight in Advances in Neural Information Processing Systems 20 (NIPS'2007), 2007.
• Jianwu Xu, Shipeng Yu, Jinbo Bi, Lucian Lita, Stefan Niculescu, and R. Bharat Rao. Automatic Medical Coding of Patient Records via Weighted Ridge Regression. To appear in the 6th International Conference on Machine Learning and Applications (ICMLA'2007), 2007.
²
• Zhao Xu, Volker Tresp, Shipeng Yu,
• Shipeng Yu, Volker Tresp, and
• Mingrui Wu,
• Shipeng Yu,
• Shipeng Yu,
• Shipeng Yu,
• Shipeng Yu,
• Shipeng Yu,
• Yi Huang,
• Shipeng Yu,
• Zhao Xu, Volker Tresp,
• Shipeng Yu,
TALKS
• Supervised Dimensionality Reduction with Principal Component Analysis. Microsoft Research
• Learning Non-parametric Priors from Multiple Tasks. OPEN HOUSE on Multi-Task and Complex Outputs
• Bayesian Web Ranking for Information Retrieval. Microsoft Research
• Collaborative Ordinal Regression. NIPS’05 Workshop on Learning to Rank, Dec. 2005.
• A Probabilistic Clustering-Projection Model for Discrete Data. The 9th European Conference on Principles and Practice of Knowledge Discovery in Databases,
• TrueSkill™ Hits the Web: Bayesian Ranking Methods for Web Search. Intern Talk, Microsoft Research
• Multi-Label Informed Latent Semantic Indexing. The 28th Annual International ACM SIGIR Conference,
• Dirichlet Enhanced Latent Semantic Analysis. Lernen, Wissensentdeckung und Adaptivität (LWA’04), Humboldt University, Berlin, Germany, Oct. 2004.
• Enhancing Web Information Retrieval Using Web Page Segmentation. Siemens Corporate Technology,
• Enhancing Web Information Retrieval Using Web Page Segmentation. Institute for Informatics,
TEACHING EXPERIENCE
9/2000 – 1/2002: Lecturer in
• Assembly Language, Autumn 2000 and Autumn 2001
9/2001 – 1/2003: Teaching Assistant in School of Mathematical Sciences,
• Introduction to Database Systems, Autumn 2002
• MATLAB System, Spring 2002
• Statistical Computation, Autumn 2001
PATENTS
• Collaborative Ordinal Regression. Filed November 2005, Siemens Corporate Technology.
• Soft Clustering on Graphs. Filed October 2005, Siemens Corporate Technology.
• Multi-Output Regularized Projection. Filed June 2005, Siemens Corporate Technology.
• Dirichlet Enhanced Latent Semantic Analysis. Filed October 2004, Siemens Corporate Technology.
• Vision-Based Document Segmentation. Filed July 2003, Microsoft Research
AWARDS AND HONORS
• Student Travel Grant for KDD’2006
• Student Travel Grant for ICML’2006
• Best Paper Runner-Up & Best Student Paper Runner-Up Award in PKDD’2005
• Student Travel Grant for SIGIR’2005
• Student Travel Grant for ICML’2005
• Attendee of Machine Learning Summer School (MLSS’2004),
• Student Travel Grant for SIGIR’2004
• Siemens Doctoral Scholarship, Sep. 2003 – Aug. 2006
• Outstanding Undergraduate Student of
• Outstanding Undergraduate Student of
• First Prize in National Mathematical Modeling Competition, 1999
• Award for Excellent Student of Peking University, three successive years (1997–1999)
• Scholarships in three successive years (1997–1999)
• Rank 3 (4 is the highest) in National Computer Rank Examination Certificate, 1998
• First Prize in National Mathematics Competition for High School Students, 1995
COMPUTER SKILLS
Languages mastered: C, C++, Java, MATLAB, .NET, Perl, Maple
Systems experienced: MS Windows, Linux
OTHERS
• Fluent in Chinese and English; reading knowledge of German and French
• Play the accordion: Degree 5 (out of 10) in the National Accordion Certificate
• Enjoy badminton, table tennis, swimming, traveling and reading
REFERENCES
Available upon request.