Technical Program

8:00 - 9:00 Registration
All workshop participants must complete their registration at the registration desk in the ICPR Venue (Istanbul Convention & Exhibition Centre) and then walk to the workshop venue (Istanbul Military Museum).

9:00 - 9:10 Opening and Announcement of Special Issue

9:10 - 10:40 Session 1

9:10 - 9:40 R.Ewerth, J.P.Harries, and B.Freisleben. Mobile OCR on the iPhone for Different Types of Text Documents.

9:40 - 10:10 Y.Wang, L.Bai and L.Shen. AR Registration for Video-Based Navigation.

10:10 - 10:40 V.Chandrasekhar, M.Makar, G.Takacs, D.Chen, S.S.Tsai, N.-M.Cheung, R.Grzeszczuk, Y.Reznik, and B.Girod. Survey of SIFT Compression Schemes.

10:40 - 11:00 Coffee break

11:00 - 12:00 Session 2

11:00 - 11:30 Z.Yao, Y.Song, and Y.Zhang. An Efficient Seal Detecting Algorithm.

11:30 - 12:00 N.K.Aitha and S.Bandarkar. A Hybrid Multi-Layered Video Encoding Scheme for Mobile Resource-Constrained Devices.

12:00 - 13:30 Lunch break

13:30 - 15:00 Session 3

13:30 - 14:00 T.-H.Tan, C.-S.Chang, Y.-F.Chen, Y.-F.Huang, and T.-Y.Liu. Development of an Intelligent e-Restaurant with Menu Recommendation for Customer-Centric Service.

14:00 - 14:30 Y.-N.Hsu, S.-W.Chien, V.T.-H.Lin, J.Lee, T.-H.Tan, and Y.-F.Chen. Application of RFID Technology in Management of Controlled Drugs - Proof of Concept.

14:30 - 15:00 Z.Zhang, Y.Nakashima, and N.Naka. Recognition-Based Error Correction with Text Input Constraint for Mobile Phones.

15:00 - 15:20 Coffee break

15:20 - 16:50 Session 4

15:20 - 15:50 F.A.Kondori, S.Yousefi, and H.Li. Tracking Fingers in 3D Space for Mobile Interaction.

15:50 - 16:20 N.Zandi, R.Handler, J.E.Larsen, and M.K.Petersen. People, Places and Playlists: Modeling Soundscapes in a Mobile Context.

16:20 - 16:50 H.-B.Kang, J.-u.Kim, I.-W.Byun, M.Kim, and S.-Y.Park. Affection-Based Visual Communication in Mobile Environment.

16:50 Closing

Survey of SIFT Compression Schemes
V.Chandrasekhar, M.Makar, G.Takacs, D.Chen, S.S.Tsai, N.-M.Cheung, R.Grzeszczuk, Y.Reznik, and B.Girod
Abstract: Transmission and storage of local feature descriptors are of critical importance for mobile visual search applications. We perform a comprehensive survey of Scale Invariant Feature Transform (SIFT) compression schemes proposed in the literature and compare them in a common framework. Further, we compare the different schemes to the recently proposed low bit-rate Compressed Histogram of Gradients (CHoG) descriptor. We show that CHoG outperforms all SIFT compression schemes.

AR Registration for Video-Based Navigation
Y.Wang, L.Bai and L.Shen
Abstract: We have developed a novel video-based personal navigation system for mobile automotive systems and handheld augmented reality applications, using a combination of computer vision and augmented reality techniques. With this type of navigation system on PDAs or mobile phones, virtual road signs are superimposed onto the video of the real road scene. Such navigation systems allow the driver to travel to the destination by following the virtual signs in the video, offering a more intuitive and safer navigation solution. In this paper, two methods for augmented reality registration of virtual signs onto the road in the video are described. The registration methods involve camera calibration and the pose estimation of either the camera or the reference object in the scene from visual information. Registration results on real road videos are presented.

Mobile OCR on the iPhone for Different Types of Text Documents
R.Ewerth, J.P.Harries, and B.Freisleben
Abstract: Several approaches for solving the problem of optical character recognition (OCR) for machine-printed documents have been proposed in the literature. Typically, these approaches perform well when documents are scanned under controlled illumination and at a resolution of at least 300 dpi, but when documents are captured by digital cameras at lower resolutions and in arbitrary environments, their performance degrades significantly. In this paper, we present an approach for mobile OCR running entirely on Apple's iPhone. The approach is based on the free OCR software Tesseract. Several image processing and enhancement techniques are proposed to deal with arbitrary text and illumination conditions. In contrast to previous work, all processing steps are executed on the mobile phone. Experimental results are reported for a test set of 46 images, improving the accuracy of the baseline system by more than 12% and outperforming two commercial OCR applications for the iPhone.

An Efficient Seal Detecting Algorithm
Z.Yao, Y.Song, and Y.Zhang
Abstract: This paper presents a novel seal detecting algorithm that uses statistical shape features. To extract different types of seals from a document image, candidate seal regions are located by connected-component analysis of the foreground and background. For rectangular seals, we improve the classic Hough transform method by using run-length histogram features to detect line segments and build a straight-line relationship matrix. For circular and elliptical seals, the center is located efficiently using run-length information and symmetry, and the seals are then detected by RANSAC fitting. Experimental results show that the proposed algorithm achieves high average recall and precision rates simultaneously.
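Illustration (not from the paper): the abstract names RANSAC fitting for circular seals but gives no details. The following is a minimal sketch of generic RANSAC circle fitting on 2D edge points, assuming the candidate-region step has already produced those points. The function names, the tolerance, the iteration count, and the synthetic data are all hypothetical, and the sketch omits the authors' run-length and symmetry-based center estimation.

```python
import math
import random


def circle_from_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None  # points are (nearly) collinear, no unique circle
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, iterations=200, tolerance=1.0, seed=0):
    """Keep the 3-point circle hypothesis supported by the most inliers."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    best_model, best_inliers = None, 0
    for _ in range(iterations):
        model = circle_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        # A point is an inlier if its distance to the circle is within tolerance.
        inliers = sum(
            1 for x, y in points
            if abs(math.hypot(x - cx, y - cy) - r) <= tolerance
        )
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Synthetic data (assumption): 40 points on a circle plus 4 stray points.
seal_edge = [
    (10 + 5 * math.cos(2 * math.pi * k / 40), 20 + 5 * math.sin(2 * math.pi * k / 40))
    for k in range(40)
]
noise = [(0.0, 0.0), (30.0, 30.0), (50.0, 5.0), (-10.0, 40.0)]
model, support = ransac_circle(seal_edge + noise)
```

Ellipse seals would need a five-point model in place of the three-point circle; the sampling and inlier-counting loop would stay the same.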

A Hybrid Multi-Layered Video Encoding Scheme for Mobile Resource-Constrained Devices
N.K.Aitha and S.Bandarkar
Abstract: The use of multimedia-enabled mobile devices such as pocket PCs, smart cell phones and PDAs is increasing by the day and at a rapid pace. Networked environments comprising these multimedia-enabled mobile devices are typically resource constrained in terms of their battery capacity and available bandwidth. Real-time computer vision applications typically entail the analysis, storage, transmission, and rendering of video data, and are hence resource-intensive. Consequently, it is very important to develop a content-aware video encoding scheme that adapts dynamically to and makes efficient use of the available resources. A Hybrid Multi-Layered Video (HMLV) encoding scheme is proposed which comprises content-aware, multi-layer encoding of the image texture and motion, and a generative sketch-based representation of the object outlines. Each video layer in the proposed scheme is characterized by a distinct resource consumption profile. Experimental results on real video data show that the proposed scheme is effective for computer vision and multimedia applications in resource-constrained mobile network environments.

Development of an Intelligent e-Restaurant with Menu Recommendation for Customer-Centric Service
T.-H.Tan, C.-S.Chang, Y.-F.Chen, Y.-F.Huang, and T.-Y.Liu
Abstract: Traditional restaurant service is generally passive: waiters must interact with customers directly before processing their orders. However, a high-quality customer-centered service system would actively identify customers and their favorite meals and expenditure records. To achieve this goal, this study integrates radio frequency identification (RFID), wireless local area network (WLAN), database technologies and a menu recommendation subsystem to develop an intelligent e-restaurant for customer-centric service. This system enables waiters to immediately identify customers via RFID-based membership cards and then to actively recommend the most appropriate menus for customers. Experimental results obtained from a case study conducted in a restaurant indicate that the proposed system has practical potential in providing customer-centric service.

Application of RFID Technology in Management of Controlled Drugs - Proof of Concept
Y.-N.Hsu, S.-W.Chien, V.T.-H.Lin, J.Lee, T.-H.Tan, and Y.-F.Chen
Abstract: Due to their potential for habitual use, dependence, abuse, and danger to society, controlled drugs are tightly regulated in many countries. With the integration of a Radio Frequency Identification (RFID) system, controlled drug management can be more accurate and effective. Furthermore, RFID could play a key role in discriminating between genuine and counterfeit drugs. Hence, improving patient safety is becoming a very important issue for healthcare organizations. Medication safety is the most important topic in improving patient safety. In addition to reducing the error rate of records written manually in the traditional procedures, the application of RFID technology is even more effective in controlling and monitoring the legal disposition of controlled drugs and in ensuring the accuracy of their pedigrees.

Recognition-Based Error Correction with Text Input Constraint for Mobile Phones
Z.Zhang, Y.Nakashima, and N.Naka
Abstract: We propose a highly practical error correction method for mobile phones. Given a recognition error, we use a re-recognition-based correction method that is constrained by the user's input of the first few correct characters. It forces the system to produce text that begins with these characters. Since this method refreshes the correction result in response to each character input by the user, error correction is finished with the fewest possible keystrokes. Our experiment shows that this method reduces the number of input strokes by 70%; its correction rate reaches 80%.
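Illustration (not from the paper): the prefix constraint described above can be sketched as a filter over a scored list of recognition hypotheses. This is a simplification made for illustration: the abstract describes re-running recognition under the constraint, whereas this sketch only re-selects from an already available N-best list. The function name, the scores, and the example hypotheses are hypothetical.

```python
def correct_with_prefix(hypotheses, typed_prefix):
    """Return the best-scoring hypothesis that begins with typed_prefix, or None.

    hypotheses: list of (text, score) pairs, higher score meaning more likely.
    typed_prefix: the characters the user has entered so far.
    """
    matching = [(text, score) for text, score in hypotheses
                if text.startswith(typed_prefix)]
    if not matching:
        return None  # no hypothesis agrees with what the user typed
    return max(matching, key=lambda pair: pair[1])[0]


# Hypothetical N-best list from some recognizer (assumption, not real output).
n_best = [("meet at ten", 0.5), ("meat at ten", 0.3), ("meet at tin", 0.2)]

# The result is refreshed after every character the user types.
for prefix in ["", "m", "me", "mea"]:
    print(repr(prefix), "->", correct_with_prefix(n_best, prefix))
```

Calling the function again after each keystroke mirrors the per-character refresh in the abstract: the user stops typing as soon as the displayed candidate is the intended text.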

Tracking Fingers in 3D Space for Mobile Interaction
F.A.Kondori, S.Yousefi, and H.Li
Abstract: The number of mobile devices such as mobile phones and PDAs has increased dramatically in recent years. New mobile devices are equipped with integrated cameras and large displays, which make interaction with the device easier and more efficient. Although most previous work on interaction between humans and mobile devices is based on 2D touch-screen displays, camera-based interaction opens a new way to manipulate in 3D space behind the device in the camera's field of view. In this paper, our gestural interaction relies heavily on particular patterns from local orientation of the image called Rotational Symmetries. This approach is based on finding the most suitable pattern from the large set of rotational symmetries of different orders, which ensures a reliable detector for fingertips and human gesture. Consequently, gesture detection and tracking can be used as an efficient tool for 3D manipulation in various applications in computer vision and augmented reality.

People, Places and Playlists: Modeling Soundscapes in a Mobile Context
N.Zandi, R.Handler, J.E.Larsen, and M.K.Petersen
Abstract: In this paper we present an initial study of music listening patterns on mobile devices combined with contextual information. The study included N=7 participants who carried a smartphone for two weeks. The participants used the main features of the phone along with the music player capabilities. All phone activities and data from embedded sensors were recorded along with the music being played on the device. We report initial indications that listening patterns in terms of music genre preferences are influenced by whether the user is in a static environment or on the move. Applying a simple decision tree algorithm to identify what contexts determine the preferences indicates that our listening patterns change over time, suggesting that music applications utilizing context information must be designed to adapt to our shifting preferences as they continuously evolve.

Affection-Based Visual Communication in Mobile Environment
H.-B.Kang, J.-u.Kim, I.-W.Byun, M.Kim, and S.-Y.Park
Abstract: In this paper, we propose a new visual communication method for short messaging systems. In Twitter in particular, text-based posts limited to 140 characters, though sometimes useful, are not efficient for clearly expressing the author's message to the followers. To address this shortcoming, we suggest posting an appropriate image with or without a text message. To generate an image consisting of characters, objects and background from the text message, the keyword detector detects the keyword and retrieves thumbnail images according to the keyword. The author can select appropriate images for background, characters and objects. To represent affection in the images, we design an affection-based re-coloring method using an interactive genetic algorithm. Our visual communication method is implemented on the iPhone and can post a tweet using an image. The survey results show that our method is viewed very favorably for posting messages and is preferred over text messages without images.
