UoS Buildings Image Dataset for Computer Vision Algorithms
The dataset for this project consists of photographs of the buildings of the University of Salford, taken with a mobile-phone camera from different angles and at different distances. Although this task sounds simple, it presented several challenges, summarized below:
a. Fixed or unremovable objects.
When taking several photos of a building or a landscape from different angles and directions, some of these angles are blocked by fixed objects such as trees and plants, light poles, signs, statues, cabins, bicycle shelters, scooter stands, generators/transformers, construction barriers, construction equipment and other service equipment, so it is unavoidable that some photos include these objects. This raises three questions:
- Will these objects confuse the model/application we intend to create? In other words, will such an obstacle prevent the model/application from identifying the designated building?
- Or will the photos be more representative with these objects included, giving the model/application the capability to identify the buildings even with these obstacles present?
- What is the maximum detection distance? In other words, how far can the mobile device running the application be from the building before it can no longer detect the designated building?
b. Removable and moving objects.
- Any university is crowded with staff and students, especially during the rush hours of the day, so it is hard to take some photos without a person appearing in them at certain times of day. Due to privacy concerns and out of respect for those individuals, such photos are better excluded.
- Parked vehicles, trolleys and service equipment can be obstacles and might appear in these images; they can also block access to some areas, so that an image from a certain angle cannot be obtained.
- Animals, such as dogs, cats, birds or even squirrels, cannot be avoided in some photos, which raises the same questions as above.
In a deep learning project, more data generally means higher accuracy and lower error. At this stage of the project it was agreed to collect 50 photos per building; this number could be increased for more accurate results, but due to the time limitation of the project it was kept at 50 per building.
These photos were taken on cloudy days. To expand this work in the future (as future works and recommendations), photos taken on sunny, rainy, foggy, snowy and other weather-condition days can be included, as can photos at different times of day such as night, dawn and sunset, to give the designated model every possibility of identifying these buildings under all available circumstances.
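As an illustration of how the existing cloudy-day photos might be stretched without new field trips, simple image augmentation (brightness shifts and horizontal flips) can approximate some lighting variation. Augmentation is a technique suggested here, not one used in the current dataset; the sketch below uses small NumPy arrays as stand-ins for real photos.

```python
import numpy as np

def augment(image, brightness=1.0, flip=False):
    """Return a copy of `image` with brightness scaled and an optional horizontal flip.

    `image` is an HxWx3 uint8 array; scaled values are clipped back to [0, 255].
    """
    out = image.astype(np.float32) * brightness
    out = np.clip(out, 0, 255).astype(np.uint8)
    if flip:
        out = out[:, ::-1, :]  # mirror left-right
    return out

# Dummy 4x4 "photo" standing in for a real building image.
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

darker = augment(photo, brightness=0.5)   # a crude approximation of dusk lighting
mirrored = augment(photo, flip=True)      # a different apparent viewing angle
```

Such transforms cannot replace genuinely different weather or night-time photos, but they are a low-cost first step toward the robustness discussed above.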
3. The selected buildings.
It was agreed to select only 10 of the University of Salford's buildings for this project, with at least 50 images per building. The selected buildings, with the number of images taken for each, are:
1. Chapman: 74 images
2. Clifford Whitworth Library: 60 images
3. Cockcroft: 67 images
4. Maxwell: 80 images
5. Media City Campus: 92 images
6. New Adelphi: 93 images
7. New Science, Engineering & Environment: 78 images
8. Newton: 92 images
9. Sports Centre: 55 images
10. University House: 60 images
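The per-building counts above can be tallied with a short script; the figures below simply restate the list as a quick consistency check against the agreed minimum of 50 images, not new data.

```python
# Images collected per selected building (from the list above).
image_counts = {
    "Chapman": 74,
    "Clifford Whitworth Library": 60,
    "Cockcroft": 67,
    "Maxwell": 80,
    "Media City Campus": 92,
    "New Adelphi": 93,
    "New Science, Engineering & Environment": 78,
    "Newton": 92,
    "Sports Centre": 55,
    "University House": 60,
}

total = sum(image_counts.values())                           # 751 images overall
meets_minimum = all(n >= 50 for n in image_counts.values())  # agreed floor of 50
```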
The Peel building is an iconic part of the University of Salford due to its distinctive and striking exterior design, but it was unfortunately excluded from the selection because of maintenance activities at the time the photos for this project were collected: it was partially covered with scaffolding, with a lot of movement by personnel and equipment.
If the supervisor suggests including it as an additional challenge for the project, its photos will be collected.
There are many other buildings at the University of Salford, and to expand the project in the future, all of them could be included.
The full list of the university's buildings can be reviewed via the interactive map at: www.salford.ac.uk/find-us
4. Expand Further.
This project can be improved further in many ways; again, due to the time limitation, these improvements can be implemented later as future work.
In simple terms, this project is to create an application that displays a building's name when a mobile device's camera is pointed at that building.
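The final step of that pipeline can be sketched in a few lines: a trained 10-class image classifier (the model itself is hypothetical here, and the class order is an assumption) produces one score per building, and the application converts the scores into a building name, withholding an answer when confidence is low.

```python
import math

# The 10 selected buildings, in a fixed class order (the order is an assumption).
CLASS_NAMES = [
    "Chapman", "Clifford Whitworth Library", "Cockcroft", "Maxwell",
    "Media City Campus", "New Adelphi", "New Science, Engineering & Environment",
    "Newton", "Sports Centre", "University House",
]

def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_building(scores, threshold=0.5):
    """Return (name, confidence); name is None when the model is not confident."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    name = CLASS_NAMES[best] if probs[best] >= threshold else None
    return name, probs[best]

# Hypothetical scores in which class 7 ("Newton") dominates.
name, conf = predict_building([0.1, 0.2, 0.0, 0.3, 0.1, 0.2, 0.0, 4.0, 0.1, 0.2])
```

The confidence threshold is one possible answer to the obstacle questions raised earlier: when a tree or vehicle degrades the scores, the application can say "unknown" rather than guess.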
Future features to be added:
a. Address/location: this will require collecting additional data, namely the longitude and latitude of each included building, or its postcode (which may be the same for several buildings, considering how close together they appear on interactive map applications such as Google Maps, Google Earth or Apple Maps).
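Once each building has a latitude and longitude, the device's GPS position can be matched to the nearest building with the standard haversine formula. The sketch below uses placeholder coordinates, not the real building positions, which would have to be surveyed.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points (haversine formula)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_building(lat, lon, buildings):
    """Return (name, distance_m) of the building closest to the device position."""
    return min(
        ((name, haversine_m(lat, lon, blat, blon)) for name, (blat, blon) in buildings.items()),
        key=lambda t: t[1],
    )

# Placeholder coordinates -- NOT the real building positions.
BUILDINGS = {
    "Newton": (53.4830, -2.2740),
    "Chapman": (53.4850, -2.2710),
}
```

Combining this geometric estimate with the image classifier could also help disambiguate visually similar buildings.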
b. Description of the building: what the building is for, which school occupies it, and what facilities it contains.
c. Interior images: all the photos at this stage were taken of the exteriors of the buildings. Will interior photos make an impact on the model/application? For example, if the user is inside Newton or Chapman and opens the application, will the building be identified, especially given that the interiors of these buildings are highly similar in their corridors, rooms, halls and labs? Will furniture and assets act as obstacles or as identification marks?
d. Directions to a specific area/floor inside the building: if interior images succeed with the model/application, it would be a good idea to add a search option that can guide the user to a specific area, showing directions to it. For example, if the user is inside the Newton building and searches for Lab 141, the application will direct them to the first floor with an interactive arrow that updates as they approach their destination.
Alternatively, if the application can identify the building from its interior, a drop-down list could be activated with each floor of that building. For example, if the model/application identifies the Newton building, pressing the drop-down list would present interactive tabs for each floor; selecting a floor's tab would display the facilities on that floor (e.g. pressing the Floor 1 tab would open a screen showing which facilities are located there).
Furthermore, if the model/application identifies a different building, it should activate a different number of floors, as buildings differ in their number of floors.
This feature could be improved with a voice assistant that directs the user after a search is made (something similar to the voice assistant in Google Maps, but applied to the interiors of the university's buildings).
e. Top view: if a camera drone can be afforded, it could provide aerial images and top views of the buildings to add to the model/application. However, these images may share the problem of the interior images: buildings can look similar to each other from above, with additional obstacles such as water tanks and AC units.
5. Other Questions:
- Will the model/application be reproducible? The presumed answer should be yes, provided the model/application is fed with the proper data (images), such as images of restaurants, schools, supermarkets, hospitals, government facilities, etc.