Grab websites and web pages
Besides typing keywords to perform a search, you can also type the URL of a website or web page in the search bar. You then have three possibilities:
- Get all the images of the web site
- Get all the images of the web page
- As a spider, follow all links on the page and on the pages it links to, and collect all images found on the way.
Each of the three can be very powerful. You only have to find the right sites to grab, and GrabJPG offers help with that as well, as we discuss further on.
Let's first type a URL in the search bar and click the GO button. A good URL to start with might be www.nasa.gov, the NASA website.
A screen pops up where you are first asked to choose whether you want to grab the site, grab the page or spider all links. We are interested in the images from NASA, not only those on the first page, and this is not the right kind of page for spidering all links. Choose the first option and press Next.
Grabbing the entire site is the most common choice. Grabbing only one page can be desirable if you found that a site has one or a few pages containing the images you want.
The third option, "Grab all images everywhere this page links to", is the most advanced of the three. You'll be presented with two extra options:
Follow link: Here you can choose one of three:
- Follow only links that stay on this site
- Follow only links that go out of this site
- Follow all links
So you can scan only a portion of a certain site, use a page that gives relevant links while ignoring the rest of its site, or go pretty much anywhere.
The second option makes directories like Yahoo a good starting point for an advanced grab. Find a page there that links to sites related to the subject you want images for and let GrabJPG follow only the links that go out of the site. (Otherwise you would also scan the whole of Yahoo, which can take weeks.)
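The three Follow link choices can be read as a filter on the host of each link. The sketch below is only an illustration in Python, not GrabJPG's own code: the function names, the scope constants and the example URLs are hypothetical, and treating "this site" as "same host name, ignoring a leading www." is a simplifying assumption.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical scope names mirroring the three "Follow link" choices.
STAY_ON_SITE = "stay"   # follow only links that stay on this site
LEAVE_SITE = "leave"    # follow only links that go out of this site
FOLLOW_ALL = "all"      # follow all links


def host_of(url):
    """Return the lower-cased host name of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def should_follow(start_url, link, scope):
    """Decide whether a link found on start_url falls inside the chosen scope."""
    target = urljoin(start_url, link)  # resolve relative links first
    same_site = host_of(target) == host_of(start_url)
    if scope == STAY_ON_SITE:
        return same_site
    if scope == LEAVE_SITE:
        return not same_site
    return True  # FOLLOW_ALL


# Example: a directory page (assumed URL) that links to itself and to two other sites.
start = "http://dir.example.com/space/"
links = ["/space/photos.html", "http://www.nasa.gov/", "http://images.example.org/a"]
outgoing = [link for link in links if should_follow(start, link, LEAVE_SITE)]
print(outgoing)
```

With the "go out of this site" scope, only the two links to other hosts remain, which is the behaviour you want when starting from a directory page.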
Maximum link depth: If your page links to 15 other pages, each of which links to 15 other pages, and so on, you will already have more than 11 million pages (15^6) to scan by the time you are six levels deep. Such a grab will never finish and will soon reach a lot of pages that are not really related. Most pages link to many more than 15 other pages. Maximum link depth lets you limit how deep GrabJPG goes.
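To see how quickly the number of pages grows, the arithmetic can be worked out in a few lines of Python. This is a rough sketch under two assumptions: every page links to exactly 15 pages that were not seen before, and the pages that the start page links to count as level 1. Under that counting, level 5 holds 15^5 = 759,375 pages and level 6 holds 15^6 = 11,390,625 pages. The function names are made up for this example.

```python
def pages_at_level(links_per_page, level):
    """Pages reached at a given link level, assuming no page is counted twice."""
    return links_per_page ** level


def pages_up_to(links_per_page, max_depth):
    """Total pages to scan for levels 1..max_depth (the start page is level 0)."""
    return sum(pages_at_level(links_per_page, d) for d in range(1, max_depth + 1))


# Print level, pages at that level, and running total for 15 links per page.
for depth in range(1, 7):
    print(depth, pages_at_level(15, depth), pages_up_to(15, depth))
```

Each extra level multiplies the work by 15, so lowering the maximum link depth by one is the most effective way to keep a spider grab manageable.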
If you are new to GrabJPG, it is perhaps best to ignore this third option and first explore the rest a bit more.
Configure and go
Press Next to go to the next screen.
Press the Finish button and the grab will start.
Grab properties
Minimum image size
Images have a size expressed as the width and height of the image in pixels. Usually you want to ignore the smallest ones; often they are thumbnails of a larger image that you will also get, site layout images or advertisements. By specifying a minimum you filter out those images. Another use of the minimum image size is to make sure you get only images of good quality.
Minimum file size
The image is contained in a JPG file. This file has a certain size in KB. In general, the larger the file, the higher the quality of the image. If you wish, you can filter out the smaller ones.
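The two minimum settings work as simple thresholds. The following Python sketch shows the idea; it is not GrabJPG's actual implementation. The `FoundImage` record, the function name and the example values are hypothetical, and requiring both the width and the height to reach their minimum is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FoundImage:
    """Hypothetical record for an image found during a grab."""
    url: str
    width: int      # pixels
    height: int     # pixels
    size_kb: float  # JPG file size in KB


def passes_filters(img, min_width, min_height, min_kb):
    """Keep an image only if it meets every minimum."""
    return (img.width >= min_width
            and img.height >= min_height
            and img.size_kb >= min_kb)


found = [
    FoundImage("http://example.com/thumb.jpg", 120, 90, 4.0),      # thumbnail
    FoundImage("http://example.com/banner.jpg", 468, 60, 12.0),    # advertisement
    FoundImage("http://example.com/photo.jpg", 1024, 768, 210.0),  # real photo
]
kept = [img.url for img in found if passes_filters(img, 300, 200, 20.0)]
print(kept)
```

With a minimum of 300 x 200 pixels and 20 KB, the thumbnail and the banner are dropped and only the photo is kept.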
Download folder
Later, when images are found, you might want to save some of them. To avoid being prompted for a file name and folder every time, there is a default download folder. All images you save are saved in this folder, without prompting.
Naming convention
As you are not prompted for a file name in the standard save option, GrabJPG makes up a file name itself. You can choose how it does that; by default the original file name is kept. If the file already exists, an extra identifier is added to make the name unique.
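Keeping the original name and adding an identifier only on a clash could look roughly like the Python sketch below. The manual does not say what the extra identifier looks like, so the `_1`, `_2` suffix used here is purely an assumption, as is the function name.

```python
import tempfile
from pathlib import Path


def unique_path(folder, original_name):
    """Return a path in folder that does not exist yet, keeping the original name when free."""
    folder = Path(folder)
    candidate = folder / original_name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        # Hypothetical identifier format: photo.jpg -> photo_1.jpg, photo_2.jpg, ...
        candidate = folder / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


# Try it in a temporary folder that stands in for the download folder.
with tempfile.TemporaryDirectory() as tmp:
    first = unique_path(tmp, "photo.jpg")
    first.touch()                      # pretend the first image was saved
    second = unique_path(tmp, "photo.jpg")
    print(first.name, second.name)
```

The first save keeps the name photo.jpg; the second save of a file with the same name gets the added identifier, so nothing is overwritten.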
GrabJPG grabs a website by following all links it can find, on that site or elsewhere, that refer to that site. Some sites have protected areas for members only. Often they leave a side door open in order to score high rankings in search engines.
GrabJPG goes into open side doors (and there are a lot of them) but does not perform any actions to hack or cheat itself into a site. If a door is open, we go in, otherwise not.
The grab page
At the left side of the grab page you see the properties that you defined when you started your grab. You can change those properties by pressing the Edit this grab button below them.
One thing was not in the properties: the Block field. It lists the sites or parts of sites that you blocked in the info view.
At the right side you see a gallery of images. The grab is still running while you already see the first results, and images are added as soon as they are found. You can view the collection of images in three ways: gallery view, list view and info view.
Viewing your collection of images
Any collection of images in GrabJPG has three view buttons to show the collection as a gallery, as a list or in the info view. In each view you can view, browse and manipulate the collection of images. The most commonly used view is the gallery view. Each view is described in detail on its own page in this manual.
Gallery view
List view
Info view
Full screen
You can open an image full screen by double clicking it.
Full screen viewer
Examples
To help you on your way to finding nice sites to grab, we have collected a few series of suggestions and examples.