Sunday, September 3, 2017

Python Tips

1. Do not truncate a pandas dataset while printing.
        pd.options.display.max_rows = None
 (None prints the entire dataset; give a number like 4000 to print up to 4000 rows before the output is truncated)
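
If the setting should apply to a single print only, a context manager is one option. This is a minimal sketch assuming a current pandas release, where None means "no limit"; the 100-row dataframe is made up for illustration.

```python
import pandas as pd

# Hypothetical 100-row frame, longer than the default display limit.
df = pd.DataFrame({"value": range(100)})

# The option is changed only inside the with-block and restored afterwards.
with pd.option_context("display.max_rows", None):
    full_text = repr(df)

print(full_text)
```

Outside the with-block the previous max_rows value is back in effect, so other prints stay short.
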
2. Converting a string to a datetime object
    from datetime import datetime
    date_string = "2012-08-12"
    current_date = datetime.strptime(date_string, "%Y-%m-%d")

3. Find delta days between 2 dates
  delta = abs(current_date - old_date)
  delta.days -> gives the total number of days
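
Tips 2 and 3 can be combined into one small sketch. The helper name days_between and the two dates are made up for illustration.

```python
from datetime import datetime


def days_between(first, second, fmt="%Y-%m-%d"):
    """Return the absolute number of whole days between two date strings."""
    first_date = datetime.strptime(first, fmt)
    second_date = datetime.strptime(second, fmt)
    # Subtracting two datetime objects gives a timedelta; abs() removes the sign.
    return abs(first_date - second_date).days


print(days_between("2012-08-12", "2012-07-13"))  # 30
```
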


4. Find min, max & mean (average) of the values in a dictionary of key/value pairs.

Ex :

    print(d)

    min_len = min(d.items(), key=lambda x: x[1])[1]
    max_len = max(d.items(), key=lambda x: x[1])[1]
    avg_len = float(sum(d.values()))/len(d)

    print(min_len, max_len, avg_len)

Output :
              {19: 4, 24581: 633}
               4 633 318.5
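
Since only the values matter here, the same numbers can also be computed from d.values() directly. A minimal sketch; value_stats is a hypothetical helper name.

```python
def value_stats(d):
    """Return (min, max, mean) of the values of a non-empty dictionary."""
    values = list(d.values())
    return min(values), max(values), sum(values) / float(len(values))


print(value_stats({19: 4, 24581: 633}))  # (4, 633, 318.5)
```
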

5. Merging dataframes and filling NaN values.

    merged_df = pd.merge(df_1, df_2, on='key_name_to_be_given', how='outer')
    merged_df.fillna(0, inplace=True)

    how='outer' keeps the union of the keys of both dataframes (the default, 'inner', keeps only the intersection).
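
To see the difference between the two join types, here is a small sketch with two made-up dataframes; the column names patient_id, visits and labs are assumptions for the example.

```python
import pandas as pd

df_1 = pd.DataFrame({"patient_id": [1, 2], "visits": [3, 5]})
df_2 = pd.DataFrame({"patient_id": [2, 3], "labs": [7, 9]})

# Inner join (default): only keys present in both frames survive (patient 2).
inner_df = pd.merge(df_1, df_2, on="patient_id")

# Outer join: keys from either frame survive (patients 1, 2, 3);
# cells with no match are NaN until fillna replaces them with 0.
outer_df = pd.merge(df_1, df_2, on="patient_id", how="outer").fillna(0)

print(inner_df)
print(outer_df)
```
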


6. Find whether a string in a dataframe column contains one of multiple substrings, and return the rows that contain any of them.
   bool_df  = df["column_name"].str.contains("DIAG|LAB|DRUG")
   final_df = df[bool_df]

Here, df is a pandas dataframe. str.contains treats "DIAG|LAB|DRUG" as a regular expression, so it finds rows that contain the substring DIAG or LAB or DRUG.
bool_df holds True or False for each row.
final_df will have only the rows that contain the substring DIAG or LAB or DRUG.
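
If the column can contain missing values, the mask will contain NaN for those rows and cannot be used for filtering as is. One way around it is the na argument of str.contains. A small sketch; the column name event_id and its values are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"event_id": ["DIAG320128", "LAB3009542", "PROC12", None, "DRUG19065818"]}
)

# na=False makes missing values count as "no match" instead of NaN.
mask = df["event_id"].str.contains("DIAG|LAB|DRUG", na=False)
final_df = df[mask]

print(final_df)
```
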


7. Grouping elements in a dataframe
    group = df.groupby(["column_name_1", "column_name_2"])
    for p in group.groups:
        print(p, " has ", len(group.groups[p]), " entries")

groupby accepts a single column name or a list of column names.
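
The group sizes can also be read in one call with size(). A minimal sketch on a made-up dataframe; the column names are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "patient_id": [1, 1, 1, 2],
        "event": ["LAB", "LAB", "DRUG", "LAB"],
    }
)

# size() returns a Series indexed by the group keys, holding the row count per group.
counts = df.groupby(["patient_id", "event"]).size()

for key, n in counts.items():
    print(key, "has", n, "entries")
```
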



8. Creating a new pandas dataframe with filtered columns.
Assume : the old dataframe has A, B, C, D, E columns.
 
new = old.filter(['A','B','D'], axis=1)
 
Now, new will have A, B, D columns.
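
Selecting with a list of names gives the same result, with one difference worth knowing: filter quietly skips names that do not exist, while old[[...]] raises a KeyError. A small sketch with a made-up one-row dataframe:

```python
import pandas as pd

old = pd.DataFrame({"A": [1], "B": [2], "C": [3], "D": [4], "E": [5]})

new = old.filter(["A", "B", "D"], axis=1)
same = old[["A", "B", "D"]]

# "Z" does not exist in old, so filter returns only column A.
lenient = old.filter(["A", "Z"], axis=1)

print(list(new.columns), list(lenient.columns))
```
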


9. Converting a datetime object to a string

Ex:
print(datetime_obj)
print(datetime_obj.strftime('%Y-%m-%d'))

Output:
2012-12-31 00:00:00
2012-12-31

10. Appending rows to the dataframe.

Given dead_encounter is a dictionary of key/value pairs (patient id -> datetime object),
create an empty dataframe and then add each entry from dead_encounter to the dataframe df as a row.

    df = pd.DataFrame(columns=('patient_id', 'indx_date'))
    i = 0
    for key in dead_encounter:
         df.loc[i] = (key, dead_encounter[key].strftime('%Y-%m-%d'))
         i = i+1
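
Adding rows one by one with df.loc grows the dataframe on every iteration. When the whole dictionary is already available, building the dataframe in one step is usually simpler. A sketch, with made-up contents for dead_encounter:

```python
from datetime import datetime

import pandas as pd

# Hypothetical data: patient id -> date of the encounter.
dead_encounter = {
    8193: datetime(2012, 12, 31),
    24579: datetime(2015, 8, 7),
}

# Collect all rows first, then create the dataframe once.
rows = [(pid, date.strftime("%Y-%m-%d")) for pid, date in dead_encounter.items()]
df = pd.DataFrame(rows, columns=["patient_id", "indx_date"])

print(df)
```
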

11. Reading from csv file
import pandas as pd
events = pd.read_csv(filepath + 'events.csv')

12. Writing to csv file
dataframe.to_csv(file_path + 'filename.csv', columns=['col_1', 'col_2'], index=False)

13. Appending to csv file
dataframe.to_csv(file_path + 'filename.csv', columns=['col_1', 'col_2'], index=False, mode='a', header=False)
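
Tips 11 to 13 can be tried together as a round trip. This sketch writes to a temporary directory; the file name and the column values are made up.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "events.csv")

first = pd.DataFrame({"col_1": [1, 2], "col_2": ["a", "b"]})
second = pd.DataFrame({"col_1": [3], "col_2": ["c"]})

# First write creates the file with a header row.
first.to_csv(path, index=False)
# Append mode must skip the header, otherwise it ends up in the middle of the data.
second.to_csv(path, index=False, mode="a", header=False)

events = pd.read_csv(path)
print(events)
```
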


14. Subtract 30 days from a datetime column of a dataframe.

    from datetime import timedelta
    import pandas as pd

    df = pd.read_csv(filepath + 'events.csv')
    print(df)
    df['indx_date'] = pd.to_datetime(df['indx_date'])
    df['indx_date'] = df['indx_date'] - timedelta(days=30)
    print(df)

output:
     patient_id   indx_date
0        8193.0  2012-12-31
1       24579.0  2015-08-07

     patient_id  indx_date
0        8193.0 2012-12-01
1       24579.0 2015-07-08

15. Remove complete row if value in particular column is NaN

df = df[pd.notnull(df["column_name"])]
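
dropna with the subset argument does the same thing and reads a little more directly. A small sketch on made-up data:

```python
import pandas as pd

df = pd.DataFrame(
    {"column_name": [1.0, None, 3.0], "other": ["a", "b", None]}
)

# Only NaN in "column_name" removes a row; the NaN in "other" (row 2) is ignored.
cleaned = df.dropna(subset=["column_name"])

print(cleaned)
```
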

16. Open and write line to file.

    f = open(deliverables_path + 'file_name.csv', 'w')
    f.write("name_id,feature_id,feature_value\n")  # python will convert \n to os.linesep
    f.close()

17. Drop a column in pandas:

df = df.drop('column_name', axis=1)

axis=0 is for rows and axis=1 is for columns.

In-place drop without reassigning df:

df.drop('column_name', axis=1, inplace=True)

To drop by column number instead of by column name:

df = df.drop(df.columns[[0, 1, 3]], axis=1)

This deletes the 1st, 2nd & 4th columns.


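18. Change a particular value in a dataframe.
df.at[index, "column_name"] = value_to_update

index -> the row's index label (it must be a key of the dataframe index).
df.set_value(index, "column_name", value_to_update) did the same in older pandas, but it was removed in pandas 1.0.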
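
A short sketch of single-value updates, assuming pandas 1.0 or newer (where set_value no longer exists). The index labels and values are made up.

```python
import pandas as pd

df = pd.DataFrame({"feature_value": [0.1, 0.2]}, index=[8193, 24579])

# .at updates exactly one cell, addressed by row label and column name.
df.at[24579, "feature_value"] = 0.5

# .loc with a boolean condition updates every matching row at once.
df.loc[df["feature_value"] < 0.2, "feature_value"] = 0.0

print(df)
```
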

19. Convert a dataframe to a dictionary.

df_dict = df.set_index('column_id').T.to_dict('list')
output ex :
       column_id  feature_id  feature_value
0         8193.0   3171.0         0.039215686274509803
{8193.0: [3171.0, 0.039215686274509803]}

df_dict = df.set_index('column_id').T.to_dict('records')
output ex ('records' returns a list with one dictionary per row of the transposed dataframe) :
       column_id  feature_id
0         8193.0   3171.0
[{8193.0: 3171.0}]



https://pandas.pydata.org/pandas-docs/version/0.18.1/generated/pandas.DataFrame.to_dict.html
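
To compare the shapes of the results, here is a small sketch with two made-up rows. It also shows the 'index' orientation, which keeps the column names inside the result.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "column_id": [8193.0, 24579.0],
        "feature_id": [3171.0, 3200.0],
        "feature_value": [0.25, 0.5],
    }
)

# id -> list of the remaining column values
by_id = df.set_index("column_id").T.to_dict("list")

# id -> {column name: value}, no transpose needed
by_id_named = df.set_index("column_id").to_dict("index")

print(by_id)
print(by_id_named)
```
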

20. Sorting a list of tuples by the element at a given index.

from operator import itemgetter

print(pairs)

# itemgetter(0) sorts by the first element of each tuple
pairs = sorted(pairs, key=itemgetter(0))
print(pairs)

Output:
 [(2741.0, 1.0), (2751.0, 1.0), (2760.0, 1.0), (2841.0, 1.0), (2880.0, 1.0), (2914.0, 1.0), (2948.0, 1.0), (3008.0, 1.0), (3049.0, 1.0), (1193.0, 1.0), (1340.0, 1.0), (1658.0, 1.0), (1723.0, 1.0), (2341.0, 1.0), (2414.0, 1.0)]

 [(1193.0, 1.0), (1340.0, 1.0), (1658.0, 1.0), (1723.0, 1.0), (2341.0, 1.0), (2414.0, 1.0), (2741.0, 1.0), (2751.0, 1.0), (2760.0, 1.0), (2841.0, 1.0), (2880.0, 1.0), (2914.0, 1.0), (2948.0, 1.0), (3008.0, 1.0), (3049.0, 1.0)]
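
itemgetter also accepts several indexes, which sorts by one element first and breaks ties with another. A small sketch with three made-up tuples:

```python
from operator import itemgetter

pairs = [(2741.0, 1.0), (1193.0, 2.0), (1340.0, 1.0)]

# Sort by the first element of each tuple.
by_first = sorted(pairs, key=itemgetter(0))

# Sort by the second element, then by the first element for equal values.
by_second_then_first = sorted(pairs, key=itemgetter(1, 0))

print(by_first)
print(by_second_then_first)
```
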






Sunday, June 4, 2017

Network Security


Kali Linux is a Debian-derived Linux distribution designed for digital forensics and penetration testing.
You can download it and start getting hands-on experience from : https://www.kali.org/

Download the intentionally vulnerable Linux virtual machine, Metasploitable, from : https://information.rapid7.com/metasploitable-download.html
Metasploitable helps you get hands-on experience in understanding how a vulnerability can be exploited from outside, for example by using tools from Kali Linux.


Exploit database : https://www.rapid7.com/db/modules/

Cyber crimes


Avoid black hat SEO to stay in good standing with search engines

https://unamo.com/blog/seo/8-risky-black-hat-seo-techniques-used-today
Cloaking : Presenting different content to users and to bots (search engine spiders).
Doorway pages : Pages that list many keywords in the hope of increasing search engine ranking. The page is optimized to be found for the given keywords, but when a user enters it the content has very little relevance to those keywords, and scripts on the page redirect the visitor to the attacker's page.

Botnet C&C (command and control) channels:
1. IRC (Internet Relay Chat) channels.
2. P2P botnets.
3. Fast flux DNS.
4. Random domain name generation.


Top three countries where spam-directed visitors added items to their shopping cart:
1. United States
2. Canada
3. Philippines
Ref : https://cs.gmu.edu/~mccoy/papers/purchasepair-usesec11.pdf


Scamming isn't easy, as one has to find and pay extra for services such as:
1. Shady domain name providers.
2. Bulletproof DNS providers.
3. Bulletproof web server providers.
Indeed, one needs resilient hosting with distributed web servers, domain randomization & DNS fast flux enabled.

Penetration testing

Helps to evaluate:
1. Procedural : e.g. incident response processes, management oversight.
2. Operational
3. Technological

Benefits are,
1. Clear understanding of security of network.
2. Discovery of any vulnerabilities.
3. Demonstration of any threats that could happen.

The scope of penetration testing is not just technical and cyber operations, but also social engineering and gaining access to the organization's physical assets.

Methodologies.
1. Footprinting : whois, nslookup.
2. Scanning : nmap, fping, TCP/UDP scanning with SuperScan, OS detection with queso.
3. Enumeration : list user accounts with DumpACL (DumpSec) and sid2user, list file shares with showmount and legion, identify applications with rpcinfo and telnet or netcat.
4. Gaining access : password eavesdropping with tcpdump/ssldump & L0phtcrack, file share brute forcing with NAT and legion, password file grabbing with pwdump2 & tftp (Trivial File Transfer Protocol).
5. Escalating privileges : L0phtcrack, John the Ripper (free password cracking tool), getadmin and sechole exploits. The goal is to raise the privilege from the normal user access gained in the previous step to root/admin access.
6. Pilfering : Once an attacker gets privileged access to a system, he can steal information from it, which may help to gain further access to other systems that trust the compromised system, or to get access to any data.
Tools : rhosts, LSA secrets.
7. Covering tracks : This step is performed to avoid being detected, tracked and blacklisted by the administrator. Tools such as zap and Event Log GUI are used to edit/clear system logs, and hiding tools such as rootkits are used to hide malware.
8. Creating backdoors : Gaining first-time access to a new system is hard; once it is gained, one can create trapdoors/backdoors to make subsequent access easy. Ex : creating rogue user accounts, placing remote access utilities (remote desktop, netcat, remote.exe, VNC, bo2k), replacing apps with Trojans (fpnwclnt.dll), or editing registry keys.


Social engineering is the cheapest and most cost-effective way of getting into a system in a network, as the attacker does not need specialized tools or technical knowledge.
RSA's explanation of how its security was breached by a social engineering method:
https://www.theregister.co.uk/2011/04/04/rsa_hack_howdunnit/
One email caused a loss of $66 million to the company.

Common social engineering techniques involve:
  1. Impersonating
  2. Help desk
  3. Third party authorization.
  4. Tailgating.
  5. Snail mails.
  6. Tech support.
Computer-based techniques:
  1. Pop-up windows.
  2. Email attachments.
  3. Websites.
  4. Email scams.
  5. Instant messaging & IRC(Internet Relay Chat).