Fresh Hacker News | Computer Vision: Algorithms and Applications, 2nd ed

▲Computer Vision: Algorithms and Applications, 2nd ed (szeliski.org)

76 points by ibobev 3 days ago | 5 comments

▲dimatura 1 hour ago

This is a great book - learned a lot from the first edition back in the day, and got the second edition as soon as it came out. It's always fun to just leaf through a random chapter.

▲aanet 2 hours ago

I've seen this post on HN so many times.

Would love to hear whether any undergrad or grad-level courses follow this book (or others) and cover computer vision from basic to advanced.

Thanks!

▲bonoboTP 1 hour ago

It's right there on the linked website under "Slide sets and lectures".

▲aanet 1 hour ago

Thanks

I must be blind

▲swader999 54 minutes ago

This is the right area for you to be in at least.

▲krapht 4 hours ago

An excellent book for fundamentals. I still haven't found a good textbook that covers the next level, one that takes you from student to competent practitioner. The advanced knowledge I've picked up in this field has come from coworkers, painfully gained experience, and reading Kaggle writeups.

▲bonoboTP 4 hours ago

It gets specialized after that. You need to be more specific about the area you are interested in. Computer vision is a very broad field. For newer topics, there are often no textbooks yet: books take time to write, and the methods and practices change quickly, so little has had the chance to stand the test of time. Your best bet is arXiv and GitHub to learn the latest things.

Object detection / segmentation, human pose (2D/3D), 3D human motion tracking and modeling, multi-object tracking, re-identification and metric learning, action recognition, OCR, handwriting, face and biometrics, open-vocabulary recognition, 3D geometry and vision-language-action models, autonomous driving, epipolar geometry, triangulation, SLAM, PnP, bundle adjustment, structure-from-motion, 3D reconstruction (meshes, NeRFs, Gaussian splatting, point clouds), depth/normal/optical flow estimation, 3D scene flow, recovering material properties, inverse rendering, differentiable rendering, camera calibration, sensor fusion, IMUs, LiDAR, bird's-eye-view perception. Generative modeling, text-to-image diffusion, video generation and editing, question answering, un- and self-supervised representation learning (contrastive, masked modeling), semi/weak supervision, few-shot and meta-learning, domain adaptation, continual learning, active learning, synthetic data, test-time augmentation strategies, low-level image processing and computational photography, event cameras, denoising, deblurring, super-resolution, frame interpolation, dehazing, HDR, color calibration, medical imaging, remote sensing, industrial inspection, edge deployment, quantization, distillation, pruning, architecture search, AutoML, distributed training, inference systems, evaluation/benchmarking, metric design, explainability, etc.

You can't put all that into a single generic textbook.

▲greenavocado 4 hours ago

Plus photogrammetric scale recovery, rolling-shutter & generic-camera (fisheye, catadioptric) geometry, vanishing-point and Manhattan-world estimation, non-rigid / template-based SfM, reflectance/illumination modelling (photometric stereo, BRDF/BTDF, inverse rendering beyond NeRF), polarisation, hyperspectral, fluorescence, X-ray/CT/microscopy, active structured-light, ToF waveform decoding, coded-aperture lensless imaging, shape-from-defocus, transparency & glass segmentation, layout/affordance/physics prediction, crowd & group activity, hand/eye/gaze performance capture, sign-language, document structure & vectorisation charts, font/writer identification, 2-D/3-D primitive fitting, robust RANSAC variants, photometric corrections (rolling-shutter rectification, radial distortion, HDR glare, hot-pixel mapping), adversarial/corruption robustness, fairness auditing, on-device streaming perception and learned codecs, formal verification for safety-critical vision, plus reproducibility protocols and statistical methods for benchmarks

▲thenobsta 1 hour ago

It's astounding how much there is to this field.

▲brcmthrowaway 51 minutes ago

Any updates using AI? One-shot camera calibration?

▲lacoolj 4 hours ago

This is great, but why is it posted here like it's new? This is from 2022.

▲pthreads 1 hour ago

It is a good thing that links to useful resources like these are reposted every now and then. For many, like myself, this could be the first time seeing it. Perhaps a date tag would add some clarity for those who have already seen it.

▲JohnKemeny 3 hours ago

There's even an HN post from almost exactly 5 years ago:

Computer Vision: Algorithms and Applications, 2nd ed (szeliski.org)

0 comments

https://news.ycombinator.com/item?id=24945823

But anyway, why not? Yes, add (2020) to the title, by all means.