On October 24, vivo held its 2022 imaging strategy conference, where it laid out a more complete “blueprint” for its future imaging technologies and products across three areas: hardware imaging, visual processing, and post-processing beautification.
Over the past few years, as competition in the smartphone market has intensified and products have become increasingly homogeneous, manufacturers have begun investing heavily in imaging research and development, hoping for breakthroughs in photography. vivo itself has made imaging technology the theme of its conferences several times before.
That heavy investment has paid off. The camera on today’s smartphone is a very complex system, in some respects more complex and technically deeper than many professional cameras.
But the physical limitations of phones remain. The size of the sensor and the weight of the lens module both cap what a phone camera can do. When users judge the results “with their own eyes”, many phones disappoint. Once a technology lands in a product, it is often misaligned with what users actually need, so the improvement in experience is limited. Sometimes the technology is iterated and the product is upgraded, yet the experience actually gets worse.
This is not a challenge for vivo alone. It requires the entire industry, upstream and downstream, to rethink users’ “photography needs” and start again.
The hard-to-break limits of imaging technology
For many years there has been a popular saying in photography circles: “A bottom one size bigger crushes everyone.”
The meaning is simple. The “bottom” is slang for a digital camera’s sensor: its area determines how much light can be captured, which in turn determines image quality. The larger the “bottom”, the better the result.
Digital cameras are classed by sensor size into tiers such as full-frame, APS-C, and 1-inch. That sensor size determines image quality is the basic “law” of photography circles.
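As a rough illustration of the gap between these tiers, the sketch below compares sensor areas. The dimensions are nominal values assumed for illustration (APS-C in particular varies by manufacturer), and the comparison assumes the same f-number, exposure time, and framing, in which case the total light collected scales with sensor area.

```python
import math

# Nominal sensor dimensions in millimetres. These are assumed typical
# values, not figures from the article; APS-C varies by manufacturer.
SENSOR_SIZES_MM = {
    "full-frame": (36.0, 24.0),
    "APS-C": (23.6, 15.7),
    "1-inch": (13.2, 8.8),
}

def sensor_area(name):
    """Return the sensor area in square millimetres."""
    width, height = SENSOR_SIZES_MM[name]
    return width * height

def stops_advantage(larger, smaller):
    """Light-gathering advantage in photographic stops (factors of two)."""
    return math.log2(sensor_area(larger) / sensor_area(smaller))

for name in SENSOR_SIZES_MM:
    print(f"{name}: {sensor_area(name):.1f} mm^2")
print(f"full-frame vs 1-inch: {stops_advantage('full-frame', '1-inch'):.1f} stops")
```

Under these assumptions a full-frame sensor has a little over seven times the area of a 1-inch sensor, an advantage of nearly three stops.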
In a phone, internal space is so precious that a camera-like design is impossible. Even the largest sensors fitted to phone camera modules in recent years have only reached the “1-inch bottom” class, which still counts as a “small bottom” in the camera world.
At the same time, the shooting conditions a phone has to handle are more varied. Ordinary users will not add artificial lighting or mount the phone on a tripod to lengthen the exposure, which makes the shortage of light worse and places higher demands on the imaging system.
To solve the image-quality problem of phone photography, the industry has come up with many approaches. They range from basic “hardware stacking” (raising the sensor’s pixel count and area, redesigning its sub-pixels) to “computational photography”, which combines multiple exposures to reduce noise at high sensitivities, raise brightness, sharpen the edges of blurred objects…
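The noise benefit of combining multiple exposures can be shown with a minimal simulation. This is a toy model, not any manufacturer’s pipeline: it assumes a static scene, perfectly aligned frames, and independent Gaussian noise per frame, and all names and values are hypothetical. Under those assumptions, averaging N frames cuts the noise by about the square root of N.

```python
import random
import statistics

def stack_frames(true_value, noise_sigma, n_frames, rng):
    """Average n_frames noisy readings of one pixel (idealized stacking)."""
    readings = [rng.gauss(true_value, noise_sigma) for _ in range(n_frames)]
    return statistics.fmean(readings)

def measured_noise(n_frames, trials=2000, true_value=100.0,
                   noise_sigma=10.0, seed=0):
    """Estimate the residual noise (standard deviation) after stacking."""
    rng = random.Random(seed)
    results = [stack_frames(true_value, noise_sigma, n_frames, rng)
               for _ in range(trials)]
    return statistics.pstdev(results)

single = measured_noise(1)    # one exposure: noise close to noise_sigma
stacked = measured_noise(16)  # 16 exposures: noise close to noise_sigma / 4
print(f"single frame noise: {single:.2f}")
print(f"16-frame stack noise: {stacked:.2f}")
```

Real scenes move and hands shake, so practical pipelines also have to align frames and reject mismatched regions, which is where much of the processing cost comes from.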
The “night mode” that manufacturers have pushed in recent years is a prime example. Many implementations “restore” the picture with machine learning algorithms, letting the phone take a decent photo in extremely dark, almost pitch-black environments, with a brighter result than a professional camera would give.
The night mode of most phones brightens the picture to a very high level | The Verge
However, these improvements have also encountered many problems.
On the hardware side, whether by enlarging the sensor or adding pixels, the gain in light sensitivity is actually fairly limited. In low-light scenes especially, it is hard to substantially fix the shortage of incoming light.
Relying on software to improve picture quality through “computation” has problems of its own. The computational photography features developed by many manufacturers process images too crudely. Some over-sharpen the picture and accentuate facial blemishes; others raise the brightness of shadows so much that the picture looks flat. Users have criticized this lack of contrast between light and dark as a “plastic look”.
If “brightness” and “sharpness” are the only criteria, smartphone imaging has indeed improved significantly in recent years, and low-light image quality has improved greatly. But in users’ eyes, a “good” photo is obviously more than bright and sharp. “How to take good-looking photos” is a more complicated question.
Some headline imaging features also suffer from a mismatch between technology and experience. Portrait mode produces results that are messy and unnatural; cinematic mode, sports mode, and high-resolution mode are of limited practical use… Many manufacturers have also introduced RAW capture, which can indeed improve photo quality and give users more room for post-processing. In many cases, however, post-processing is so cumbersome that users are stuck in a dilemma: “the default output looks bad, and RAW is a hassle”.
These problems are concentrated in the most mainstream flagship phone, the iPhone. “The stock camera takes ugly photos” has become a general consensus among iPhone users. Almost every iPhone user installs more than one camera app: some for selfies, some for adding retro filters, some for retouching. All of them exist to make up for the stock camera’s shortcomings.
Some iPhone users even take a screenshot of the viewfinder preview instead of pressing the shutter when taking a selfie. Because the preview has not gone through the computational photography pipeline, blemishes on the face are not sharpened and shadows are not forcibly brightened, so it looks more natural.
This mismatch between features and needs has turned manufacturers’ efforts in imaging technology into “negative optimization”.
Reinventing the “ruler” of mobile imaging
There has never been much disagreement about the criteria for “good” mobile imaging.
A good imaging system must first have powerful hardware: sufficient clarity, coverage of multiple focal lengths, good latitude and HDR performance, and accurate color reproduction. Second, it should deliver a smooth experience: sensibly designed camera software, features that are easy to understand, and a high keeper rate. Finally, it needs rich, intelligent post-processing that can identify scenes and apply reasonable, natural computational photography optimizations while leaving users room for their own post-processing.
But in practice, various contradictions began to emerge.
For example, stacking hardware increases the weight and thickness of the phone and makes the camera bump too prominent, which many users are unwilling to accept. Raising the pixel count makes photo files much larger and lengthens the processing time after each shot.
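A back-of-the-envelope calculation gives a sense of scale. The sketch below assumes uncompressed 12-bit sensor data with one sample per pixel. Both are simplifying assumptions, since real files are compressed and formats differ. In this simplified model, file size grows in proportion to pixel count.

```python
def raw_size_mb(megapixels, bits_per_pixel=12):
    """Uncompressed size in megabytes (10**6 bytes), one sample per pixel.

    The 12-bit depth is an assumed, illustrative value.
    """
    total_bits = megapixels * 1_000_000 * bits_per_pixel
    return total_bits / 8 / 1_000_000

for mp in (12, 50, 108):
    print(f"{mp} MP -> {raw_size_mb(mp):.0f} MB uncompressed")
```

Every one of those bytes also has to pass through the image processing pipeline, which is one reason higher pixel counts lengthen the wait after pressing the shutter.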
Another example is the “selfie problem” mentioned above. For portraits, most Chinese brands, as well as Samsung, soften facial detail and apply some degree of skin smoothing to suit Asian users’ preferences. This processing does not match Western aesthetics, which lean toward showing facial skin detail, so phones such as the iPhone and Google Pixel lean toward “sharpening” instead.
These stacked contradictions place higher demands on phone manufacturers’ imaging departments. Each company needs not only to break through the limits with self-developed technology, but also to build a deeper understanding of user needs and preferences.
This is fully reflected in the development of vivo’s imaging technology. From the lens optical system and CMOS sensor to the ISP chip and computational photography algorithms, vivo has put resources into almost every link of mobile imaging and has kept pushing on the underlying technology, with an optical system jointly developed with Zeiss and its self-developed ISP chip, the V1.
The next-generation self-developed ISP chip released by vivo at the imaging strategy conference | vivo
At this imaging strategy conference, vivo announced a series of advances in underlying technologies, including a CMOS that can raise the sensor’s light sensitivity by 77%, an AI-ISP that makes further use of machine learning performance to speed up image processing, and a next-generation miniaturized lens module jointly developed with Zeiss. The core goal of mobile imaging has never changed: within a limited module size, capture as much light as possible and add as much computing power as possible.
Beyond hard work, choices matter too. Over the past few years, while investing heavily in underlying imaging technology, vivo has chosen not to chase specifications, keeping figures such as pixel count and zoom factor relatively conservative. Instead, it has chosen to build a series of technologies that reflect its own “unique understanding”.
For example, the “micro-gimbal” focuses on stabilization: while letting in more light, it also improves the shooting experience and raises the keeper rate. Another example is the lens coating jointly developed with Zeiss, which addresses the ghosting caused by reflections that plagues many manufacturers.
The micro-gimbal technology that has won vivo favor in the market in recent years | vivo
These choices have been recognized by the market. According to market research firm Counterpoint, vivo became the number one smartphone brand in China in 2021, with a market share of 22%. In the second quarter of this year, vivo also achieved a breakthrough in China’s high-end smartphone market, ranking second only to Apple.
Over the past few years, amid an industry-wide dilemma in which technology does not necessarily translate into user experience, vivo has made several successful “explorations of direction”, refreshed its understanding of mobile imaging, and reshaped the ruler.
At this year’s imaging strategy conference, vivo announced that it will build a future-oriented “imaging technology matrix” across multiple dimensions, including CMOS, super-resolution algorithms, next-generation AI-ISP, and computational photography, and push further ahead.
Empower ordinary people with “a shutter”
In the early days of phone imaging, digital enthusiasts debated a question: when shooting with a phone, should the picture be brightened as much as possible so that users can easily capture what they want, or should it be exposed accurately, keeping the image as clean as possible so that it looks better?
This debate over “brightness or look and feel” foreshadowed the development of mobile imaging over the following decade.
For mobile phone users, taking pictures seems to be very simple, just point at the object and press the shutter. But in fact, professional photography is far from simple. From the selection of equipment before shooting, the layout of the scene, to the composition and exposure adjustment during shooting, to the color correction and image retouching after shooting, each link is a professional discipline.
For phone manufacturers, therefore, the biggest problem in mobile imaging is how to condense these highly specialized, complex steps into a single “shutter button”. The process is full of trade-offs like “brightness or look and feel”.
This is why vivo’s imaging strategy proposes three “shoulder-to-shoulder” goals: bringing the technical level and product experience of mobile imaging to a point where it can stand alongside professional imaging equipment, professional photography teams, and professional post-production.
At the press conference, vivo invited professional photography artists to interpret vivo’s concept and method of processing images.
One of the most distinctive ideas, and one that other manufacturers rarely mention, is the importance of “tone”. Today, when most phones shoot a scene with a high light ratio, they invoke HDR to brighten the shadows and pull down the highlights so as to show as much detail as possible.
But that is not how the human eye sees the world. What people see is a “combination of light and shadow”; it takes both light and dark to give a picture depth and layering. vivo therefore uses an optical perception system, together with recognition of the environment, and adds a “light and shadow reconstruction” step to its computational photography pipeline. This restores the tone of the picture more accurately, ensuring accurate exposure while keeping the picture from being merely “silly bright” and devoid of layers.
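The flattening that aggressive HDR processing can cause is easy to see with a toy global tone curve. The sketch below is purely illustrative: the curve, its parameters, and the scene values are all hypothetical, and it is not a model of vivo’s “light and shadow reconstruction” or of any real phone’s pipeline.

```python
def flatten_tone(luma, shadow_gamma=0.5, highlight_scale=0.85):
    """Toy global curve: gamma < 1 lifts shadows, scale < 1 lowers highlights.

    Both parameters are hypothetical values chosen for illustration.
    """
    return highlight_scale * luma ** shadow_gamma

def contrast_ratio(bright, dark):
    """Ratio of the brightest to the darkest luminance."""
    return bright / dark

# Linear luminance in [0, 1] for a hypothetical high-contrast scene.
shadow, highlight = 0.04, 0.81

before = contrast_ratio(highlight, shadow)
after = contrast_ratio(flatten_tone(highlight), flatten_tone(shadow))
print(f"contrast before: {before:.2f}:1, after: {after:.2f}:1")
```

With these values the shadow is lifted and the highlight is lowered, so the ratio between them shrinks sharply. More detail becomes visible, but the separation between light and dark that gives a picture its sense of depth is largely lost.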
Achieving this requires powerful hardware as a foundation: a CMOS with sufficient latitude and an ISP with enough computing power for scene recognition. It also requires a deep understanding of users’ shooting needs and, finally, professional photographic works used as a training set to improve the imaging system’s “aesthetic ability”.
The end product of this process is the “smart white-plus, black-minus” feature on the X80 series phones. And this feature is only the tip of the iceberg of vivo’s intelligent, all-scene optimization. Beyond tone, vivo has also optimized color, night scenes, portraits, video, and other scenarios.
At this press conference, vivo specially showcased its latest breakthroughs in multiple algorithm fields.
Among them, the “optical super-resolution algorithm” can greatly improve clarity at the ultra-telephoto end above 5X, so that ultra-telephoto is no longer a feature of little practical value. There is also the “Super Sensitive Portrait System”, which understands the information in a portrait and then optimizes the details. Rather than crudely smoothing and smearing skin, it retains detail without accentuating flaws, creating a natural and distinctive portrait atmosphere.
The final “big move” is the “VCS bionic spectrum technology”, which controls low-light noise and color reproduction. vivo said it can raise the image signal-to-noise ratio by 20% and color reproduction by 15%. The “Sky Night Scene System” that incorporates this technology builds on a CMOS with stronger light sensitivity and AI algorithms to raise night-scene light sensitivity by 100% and make “handheld shooting of the starry sky” possible.
From self-developed chips and algorithms, to optical systems and lenses jointly developed with Zeiss, to next-generation CMOS… Building an excellent mobile imaging system today means not only investing in breakthroughs at single technical points, but also starting from the entire technology matrix and advancing on all fronts, because any weak link in the matrix can hold back the experience.
All of this is meant to make phone photography as simple as pressing the shutter, yet sophisticated enough to rival professional photographic work. In vivo’s words, it will “continue to provide consumers with a humanized professional imaging experience”.
To some extent, it has done so. In 2021, the second vivo VISION+ mobile photography competition collected more than 380,000 works, conveying the emotions and strength that countless ordinary people recorded with the shutter.
This is exactly what mobile photography was all about in the first place.