Pure functions are modelled after functions in mathematics (hence pure), which are, fundamentally, just sets of pairs (with some restrictions). Anything that cannot be thought of as a lookup in some (potentially infinite) collection of pairs is, in this sense, impure.
The definition that pure functions are deterministic and without side effects depends on the meanings of "deterministic" and "side effects", which are usually distinct from what programming languages define. This distinction is useful mainly for one thing, crucial to functional languages: you can seamlessly cache the output of the function for the given input, i.e. memoize the function.
A deterministic function does not really have to be predictable in a sense that you can find out the result without ever calling the function, but once you have it, it will never change unless the arguments change. A function which returns the startup time of your program is deterministic, because it never changes during the run-time of the program (context matters here). A function which returns the current weather at the precise point in time it is called is not deterministic (if your program doesn't happen in an instant), but if the time (and location) is an argument, it is deterministic. Even a true random number generator can be deterministic here if you give it an increasing counter (but it is obliged to return the old value for an unchanged counter). Even reading a file or downloading it from web can be deterministic if you cache the result based on the location. See fn:json-doc
for example.
The second restriction is that the result (including changes in by-ref arguments) should be the only observable effect of the function. This doesn't mean that the function cannot change any state, but such change must be effectively hidden to the outside world. Caching a value must definitely change some state (to store the value), but if you don't have any way of knowing that, the cache behaves as it was pre-filled with the correct values from the beginning of the program. This is also complementary to the first restriction ‒ if a function modifies externally observable state, another function that reads the state can no longer be deterministic. If there is no such function, then it doesn't matter whether the first function modifies something, as it is not observable. By this token, all deterministic functions in the previous paragraph are also side effect-free, with the exception of sending a web request (which could have side effects). This also depends on what you choose as the environment and what counts as observing: a whole program can have side effects, but its individual functions can all be pure; and using reflection usually does not count as breaking the purity of caching functions.
Based on these definitions, there are a lot of functions that are non-deterministic but without side effects: reading from any sort of sensor, reading from a dynamic web page, reading from a port or a changeable file, getting a (pseudo-)random number (if a non-deterministic function has a side effect only on itself, then it's not really a side effect) etc. All of these functions can also be made deterministic if you "bind" them to something, i.e. a counter, time and so on.