Firewall demo hidden release INTER-378 #111

Merged 49 commits on Jan 12, 2024
Changes from 47 commits
21b55b2
feat: save bot IP to database
JuroUhlar Dec 13, 2023
b05d23b
feat: create new use case page
JuroUhlar Dec 13, 2023
4bde861
feat: display IPs table
JuroUhlar Dec 14, 2023
2e871a9
feat: simple rule creation
JuroUhlar Dec 14, 2023
bd1b721
feat: save blocked ip
JuroUhlar Dec 14, 2023
3476250
feat: save blocked ip mutation
JuroUhlar Dec 14, 2023
9542d33
feat: save blocked ips, unblock ips
JuroUhlar Dec 14, 2023
ba7a310
feat: basic functionality works
JuroUhlar Dec 18, 2023
7f59375
feat: limit how many IPs you can block
JuroUhlar Dec 18, 2023
7eb7323
chore: clean up
JuroUhlar Dec 19, 2023
2137597
feat: add block bot ip checks
JuroUhlar Dec 19, 2023
3aa019c
feat: only block your own address
JuroUhlar Dec 20, 2023
3fda0c4
feat: error handling, loading states
JuroUhlar Dec 20, 2023
c566cbf
chore: self-review fixes
JuroUhlar Dec 20, 2023
f09f552
feat: use all 5 available firewall rules
JuroUhlar Dec 21, 2023
25c0748
chore: refactor and clean up
JuroUhlar Dec 21, 2023
78371e8
chore: refactor and clean up
JuroUhlar Dec 21, 2023
a34efb3
chore: unit tests for buildFirewall rules
JuroUhlar Dec 21, 2023
7d223c5
feat: endpoint for deleting old IPs
JuroUhlar Dec 22, 2023
7e22627
feat: job for deleting old IPs
JuroUhlar Dec 22, 2023
33ccdc4
chore: clean up. try node-cron
JuroUhlar Dec 30, 2023
4b0d3e3
feat: delete old ips in cron job
JuroUhlar Jan 2, 2024
74dcab8
chore: refactor to fix test error
JuroUhlar Jan 2, 2024
ebad394
chore: fix lint error
JuroUhlar Jan 2, 2024
9f099ab
chore: split use cases in menu evenly
JuroUhlar Jan 2, 2024
9dd2f99
chore: self-review clean up
JuroUhlar Jan 2, 2024
9018756
chore: self review clean up
JuroUhlar Jan 2, 2024
6b07e0a
Merge branch 'main' into feat/firewall-demo
JuroUhlar Jan 2, 2024
a28bc5e
chore: add missing paywall link
JuroUhlar Jan 2, 2024
3181c0b
chore: clean up
JuroUhlar Jan 2, 2024
dc3b865
chore: clean up, refactor, fix eslint
JuroUhlar Jan 2, 2024
52b0e16
chore: reset blocked IPs on e2e reset
JuroUhlar Jan 3, 2024
b39addd
chore: add e2e test
JuroUhlar Jan 3, 2024
a68662d
chore: move copy to different folder
JuroUhlar Jan 3, 2024
d03144c
chore: add e2e secrets
JuroUhlar Jan 3, 2024
6b0f818
test: add reload
JuroUhlar Jan 3, 2024
e53107c
test: try removing network idle to fix flakiness
JuroUhlar Jan 3, 2024
476f741
test: try using status code to fix flakiness
JuroUhlar Jan 3, 2024
dd6c2e5
chore: try increasing timeout
JuroUhlar Jan 3, 2024
2b966ca
chore: try using a second page
JuroUhlar Jan 3, 2024
bfe09ef
chore: try running the test in Chrome only
JuroUhlar Jan 3, 2024
7c03435
chore: run Chrome only tests in first shard only
JuroUhlar Jan 3, 2024
b12f4f7
chore: clean up
JuroUhlar Jan 3, 2024
bff1cd4
chore: fix copy
JuroUhlar Jan 4, 2024
f913b5d
chore: console.log -> console.error
JuroUhlar Jan 4, 2024
513a572
chore: console.error for errors
JuroUhlar Jan 4, 2024
9556ad3
review fixes: extract queries into hooks
JuroUhlar Jan 4, 2024
ab90eef
review fix: unwrap if else statements
JuroUhlar Jan 5, 2024
6f0ec1b
review fix: clarify buildFirewallRules test
JuroUhlar Jan 5, 2024
14 changes: 12 additions & 2 deletions .github/workflows/ci.yml
@@ -5,8 +5,13 @@ on:
pull_request:
branches: [main]
env:
# Playwright headless browsers running in CI get low confidence scores, causing flaky tests. Lower the confidence score threshold for CI testing.
# Playwright headless browsers running in CI get low confidence scores, causing flaky e2e tests. Lower the confidence score threshold for CI testing.
MIN_CONFIDENCE_SCORE: 0
# Staging Cloudflare credentials and IDs for e2e tests
CLOUDFLARE_API_TOKEN: '${{ secrets.CLOUDFLARE_API_TOKEN }}'
CLOUDFLARE_ZONE_ID: '${{ secrets.CLOUDFLARE_ZONE_ID }}'
CLOUDFLARE_RULESET_ID: '${{ secrets.CLOUDFLARE_RULESET_ID }}'

jobs:
lint:
name: Lint
@@ -111,7 +116,12 @@ jobs:
run: yarn build

- name: Run Playwright tests
run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
run: npx playwright test --grep-invert CHROME_ONLY --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}

# Some tests are only run on Chrome, marked with CHROME_ONLY in their name
- name: Run Chrome-only Playwright tests
run: npx playwright test --grep CHROME_ONLY --project='chromium'
if: matrix.shardIndex == 1
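
The two Playwright steps above split the suite by test title: `--grep-invert CHROME_ONLY` leaves the tagged tests out of the sharded run, and `--grep CHROME_ONLY` selects only them for a single Chromium run on the first shard. A minimal sketch of that partition, assuming the pattern is applied as a regular expression to the full test title including the `describe` name. The `partitionByTag` helper and the sample titles are hypothetical and not part of this PR:

```typescript
// Hypothetical helper: mimics how --grep / --grep-invert split tests by title.
function partitionByTag(titles: string[], tag: RegExp): { tagged: string[]; untagged: string[] } {
  const tagged = titles.filter((title) => tag.test(title));
  const untagged = titles.filter((title) => !tag.test(title));
  return { tagged, untagged };
}

// Sample titles, written as "<describe name> <test name>".
const titles = [
  'Bot Firewall Demo CHROME_ONLY Should display bot visit and allow blocking/unblocking its IP address',
  'Scraping flights is not possible with Bot detection on',
  'Scraping flights is possible with Bot detection off',
];

// No `g` flag on the pattern, so RegExp.test() keeps no state between calls.
const { tagged, untagged } = partitionByTag(titles, /CHROME_ONLY/);
console.log(`Chrome-only: ${tagged.length}, sharded: ${untagged.length}`);
```

Tagging by title keeps the selection visible in the test file itself, at the cost of relying on a naming convention that nothing enforces.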

- name: Upload Playwright report
uses: actions/upload-artifact@v3
3 changes: 3 additions & 0 deletions .gitignore
@@ -56,3 +56,6 @@ tsconfig.tsbuildinfo

# MacOS Finder
.DS_Store

# Local experiments
.scratchpad/*
44 changes: 44 additions & 0 deletions cron-jobs/delete_expired_ip_rules.ts
@@ -0,0 +1,44 @@
import { BlockedIpDbModel } from '../src/server/botd-firewall/blockedIpsDatabase';
import { Op } from 'sequelize';
import { syncFirewallRuleset } from '../src/server/botd-firewall/cloudflareApiHelper';
import { schedule } from 'node-cron';
import { HOUR_MS } from '../src/shared/timeUtils';
import 'dotenv/config';

/**
* In production, run this file in conjunction with the production web server like:
* yarn start:with-cron-jobs
*/

// Every 5 minutes
schedule('*/5 * * * *', () => {
deleteOldIpBlocks();
});

const IP_BLOCK_TIME_TO_LIVE_MS = HOUR_MS;

async function deleteOldIpBlocks() {
try {
// Remove expired IP blocks
const deletedCount = await BlockedIpDbModel.destroy({
where: {
timestamp: {
[Op.lt]: new Date(Date.now() - IP_BLOCK_TIME_TO_LIVE_MS).toISOString(),
},
},
});

console.log(`Deleted ${deletedCount} expired blocked IPs from the database.`);

/**
* Construct updated firewall rules from the blocked IP database and apply them to the Cloudflare application.
* Note: We do this even if no IPs were deleted:
* A user might have blocked their IP but the database might have been cleared during site deployment right after,
* potentially leaving the IP blocked beyond the desired TTL. Safer to sync the firewall ruleset every time.
*/
await syncFirewallRuleset();
console.log(`Updated Cloudflare firewall.`);
} catch (error) {
console.error(`Error deleting old blocked IPs: ${error}`);
}
}
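
The expiry condition in the job above is "timestamp older than now minus TTL". A minimal sketch of the same comparison as a pure function, which may be easier to unit test than the database query. The `isExpired` helper is hypothetical, and the value of `HOUR_MS` is assumed to be 3,600,000 ms because `timeUtils` is not shown in this diff. Unlike the query, which passes an ISO string to the database, this sketch compares numeric milliseconds:

```typescript
// Assumed value of the HOUR_MS constant imported by the cron job.
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper mirroring the `timestamp < now - TTL` condition of the query.
function isExpired(timestampIso: string, nowMs: number, ttlMs: number = HOUR_MS): boolean {
  return new Date(timestampIso).getTime() < nowMs - ttlMs;
}

const now = Date.parse('2024-01-05T12:00:00.000Z');
console.log(isExpired('2024-01-05T10:59:59.000Z', now)); // true: older than one hour
console.log(isExpired('2024-01-05T11:30:00.000Z', now)); // false: still within the TTL
```

Because the job runs every 5 minutes, a block can outlive the one-hour TTL by up to roughly one scheduling interval before it is removed.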
48 changes: 48 additions & 0 deletions e2e/bot-firewall.spec.ts
@@ -0,0 +1,48 @@
import { expect, test } from '@playwright/test';
import { resetScenarios } from './resetHelper';
import { TEST_IDS } from '../src/client/testIDs';
import { BOT_FIREWALL_COPY } from '../src/client/bot-firewall/botFirewallCopy';

/**
* CHROME_ONLY flag tells the GitHub action to run this test only using Chrome.
* This test relies on a single common Cloudflare ruleset, so we cannot run multiple instances of it at the same time.
*/
test.describe('Bot Firewall Demo CHROME_ONLY', () => {
test.beforeEach(async ({ page }) => {
await page.goto('/coupon-fraud');
await resetScenarios(page);
});

test('Should display bot visit and allow blocking/unblocking its IP address', async ({ page, context }) => {
// Record bot visit in web-scraping page
await page.goto('/web-scraping');
await expect(page.getByTestId(TEST_IDS.common.alert)).toContainText('Malicious bot detected');

// Check bot visit record and block IP
await page.goto('/bot-firewall');
await page.getByRole('button', { name: BOT_FIREWALL_COPY.blockIp }).first().click();
await page.getByText('was blocked in the application firewall').waitFor();
await page.waitForTimeout(3000);

/**
* Try to visit web-scraping page, should be blocked by Cloudflare
* Checking the response code here as parsing the actual page is flaky for some reason.
* Using a separate tab also seems to help with flakiness.
*/
const secondTab = await context.newPage();
await secondTab.goto('https://staging.fingerprinthub.com/web-scraping');
await secondTab.reload();
await secondTab.getByRole('heading', { name: 'Sorry, you have been blocked' }).waitFor();

// Unblock IP
await page.goto('/bot-firewall');
await page.getByRole('button', { name: BOT_FIREWALL_COPY.unblockIp }).first().click();
await page.getByText('was unblocked in the application firewall').waitFor();
await page.waitForTimeout(3000);

// Try to visit web-scraping page, should be allowed again
await secondTab.goto('https://staging.fingerprinthub.com/web-scraping');
await secondTab.reload();
await expect(secondTab.getByTestId(TEST_IDS.common.alert)).toContainText('Malicious bot detected');
});
});
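
The test above waits a fixed 3 seconds after each block and unblock so the Cloudflare rule change can take effect. One alternative is to poll until the expected state appears. The sketch below is a generic, hypothetical `pollUntil` helper and is not part of this PR; the simulated check stands in for "the page is now blocked". Playwright's own retrying assertions may already cover this need, so treat it only as an illustration of the idea:

```typescript
// Hypothetical helper: retries `check` until it resolves to true or the timeout elapses.
async function pollUntil(
  check: () => Promise<boolean>,
  options: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<boolean> {
  const { timeoutMs = 10000, intervalMs = 500 } = options;
  const deadline = Date.now() + timeoutMs;
  do {
    if (await check()) return true;
    // Wait before the next attempt.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  } while (Date.now() < deadline);
  return false;
}

// Simulated check: reports success on the third attempt.
let attempts = 0;
pollUntil(async () => ++attempts >= 3, { timeoutMs: 2000, intervalMs: 10 }).then((ok) =>
  console.log(`succeeded: ${ok}, attempts: ${attempts}`),
);
```

Polling finishes as soon as the condition holds and fails with a clear timeout otherwise, whereas a fixed wait is either too long or occasionally too short.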
3 changes: 1 addition & 2 deletions e2e/scraping/protected.spec.ts
@@ -3,8 +3,7 @@ import { TEST_IDS } from '../../src/client/testIDs';

test.describe('Scraping flights', () => {
test('is not possible with Bot detection on', async ({ page }) => {
await page.goto('/web-scraping');
await page.waitForLoadState('networkidle');
await page.goto('/web-scraping', { waitUntil: 'networkidle' });
await expect(page.getByTestId(TEST_IDS.common.alert)).toContainText('Malicious bot detected');
});
});
5 changes: 1 addition & 4 deletions e2e/scraping/unprotected.spec.ts
@@ -11,13 +11,11 @@ const scrapeText = async (parent: Locator, testId: string) => {

test.describe('Scraping flights', () => {
test('is possible with Bot detection off', async ({ page }) => {
await page.goto('/web-scraping?disableBotDetection=1');
await page.waitForLoadState('networkidle');
await page.goto('/web-scraping?disableBotDetection=1', { waitUntil: 'networkidle' });
// Artificial wait necessary to prevent flakiness
await page.waitForTimeout(3000);

const flightCards = await page.getByTestId(TEST_ID.card).all();
console.log('Found flight cards: ', flightCards.length);
expect(flightCards.length > 0).toBe(true);

const flightData = [];
@@ -34,6 +32,5 @@ test.describe('Scraping flights', () => {

expect(flightData.length > 0).toBe(true);
writeFileSync('./e2e/output/flightData.json', JSON.stringify(flightData, null, 2));
console.log("Scraped flight data saved to 'e2e/output/flightData.json'");
});
});
4 changes: 2 additions & 2 deletions e2e/zodUtils.ts
@@ -36,7 +36,7 @@ export function isAgentResponse(obj: unknown): boolean {
agentResponseSchema.parse(obj);
return true;
} catch (error) {
console.log(error);
console.error(error);
return false;
}
}
@@ -102,7 +102,7 @@ export function isServerResponse(obj: unknown): boolean {
serverResponseSchema.parse(obj);
return true;
} catch (error) {
console.log(error);
console.error(error);
return false;
}
}
1 change: 0 additions & 1 deletion next.config.js
@@ -6,7 +6,6 @@ const path = require('path');
**/
module.exports = {
images: {
domains: ['images.unsplash.com', 'localhost'],
formats: ['image/webp'],
},
sassOptions: {
13 changes: 11 additions & 2 deletions package.json
@@ -6,6 +6,8 @@
"dev": "next dev",
"build": "next build",
"start": "next start",
"start:with-cron-jobs": "run-p start cron",
"cron": "tsx cron-jobs/delete_expired_ip_rules.ts",
"lint": "next lint",
"lint:fix": "yarn lint --fix",
"prettier": "prettier src --check",
@@ -36,6 +38,7 @@
"classnames": "^2.3.2",
"framer-motion": "^10.13.2",
"include-media": "^2.0.0",
"is-ip": "^5.0.1",
"leaflet": "^1.9.4",
"next": "^14.0.3",
"next-usequerystate": "^1.9.1",
@@ -46,7 +49,8 @@
"react-query": "^3.39.1",
"react-syntax-highlighter": "^15.5.0",
"react-use": "^17.4.0",
"sequelize": "^6.19.0",
"sequelize": "^6.35.2",
"sharp": "^0.33.1",
"socket.io": "^4.5.4",
"socket.io-client": "^4.5.4",
"sqlite3": "^5.0.8",
@@ -57,18 +61,23 @@
"@playwright/test": "^1.40.1",
"@types/leaflet": "^1.9.3",
"@types/node": "^18.11.18",
"@types/node-cron": "^3.0.11",
"@types/react": "^18.0.27",
"@typescript-eslint/eslint-plugin": "^6.13.2",
"@vitejs/plugin-react": "^3.0.1",
"dotenv": "^16.3.1",
"eslint": "^8.55.0",
"eslint": "^8.56.0",
"eslint-config-next": "^14.0.4",
"eslint-config-prettier": "^9.1.0",
"eslint-plugin-jsx-a11y": "^6.8.0",
"eslint-plugin-prettier": "^5.0.1",
"eslint-plugin-react-hooks": "^4.6.0",
"jsdom": "^21.1.0",
"node-cron": "^3.0.3",
"npm-run-all": "^4.1.5",
"prettier": "^3.1.0",
"sass": "^1.64.1",
"tsx": "^4.7.0",
"typescript": "^4.9.5",
"vitest": "^0.28.3"
}
4 changes: 4 additions & 0 deletions src/client/bot-firewall/botFirewallCopy.ts
@@ -0,0 +1,4 @@
export const BOT_FIREWALL_COPY = {
blockIp: 'Block this IP',
unblockIp: 'Unblock',
} as const;
4 changes: 2 additions & 2 deletions src/client/components/common/Header/Header.tsx
@@ -80,11 +80,11 @@ export default function Header({ notificationBar, darkMode }: HeaderProps) {
darkMode,
leftColumns: [
{
list: USE_CASES_NAVIGATION.slice(0, 3),
list: USE_CASES_NAVIGATION.slice(0, 4),
cardBackground: true,
},
{
list: USE_CASES_NAVIGATION.slice(3),
list: USE_CASES_NAVIGATION.slice(4),
cardBackground: true,
},
],
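
The hunk above moves the hard-coded column boundary from 3 to 4 so the two menu columns stay balanced after a use case is added. A minimal sketch of deriving the boundary from the list length instead, so it would not need updating by hand. The `splitIntoTwoColumns` helper and the sample items are hypothetical and not part of this PR:

```typescript
// Hypothetical helper: splits a list into two columns; the left column takes the extra item.
function splitIntoTwoColumns<T>(items: readonly T[]): [T[], T[]] {
  const middle = Math.ceil(items.length / 2);
  return [items.slice(0, middle), items.slice(middle)];
}

// Seven placeholder entries standing in for navigation items.
const [leftColumn, rightColumn] = splitIntoTwoColumns(['a', 'b', 'c', 'd', 'e', 'f', 'g']);
console.log(leftColumn.length, rightColumn.length); // 4 3
```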
85 changes: 83 additions & 2 deletions src/client/components/common/content.tsx
@@ -28,6 +28,7 @@ export type UseCase = {
title: string;
url: string;
}[];
hiddenInNavigation?: boolean;
};

export const USE_CASES = {
@@ -227,6 +228,7 @@ export const USE_CASES = {
title: 'Paywall',
titleMeta: 'Fingerprint Use Cases | Content Paywall Live Demo',
url: '/paywall',
articleUrl: 'https://fingerprint.com/blog/how-paywalls-work-paywall-protection-tutorial/',
iconSvg: PaywallIcon,
descriptionHomepage: [
<p key="1">
@@ -371,6 +373,80 @@ export const USE_CASES = {
},
],
},
botFirewall: {
title: 'Bot-Detection-powered Firewall',
titleMeta: 'Fingerprint Use Cases | Bot-Detection-powered Firewall',
url: '/bot-firewall',
articleUrl: 'https://fingerprint.com/blog/bot-detection-powered-firewall/',
iconSvg: ScrapingIcon,
descriptionHomepage: [
<p key="1">
Integrate Fingerprint Bot Detection with your Web Application Firewall and dynamically block IP addresses linked
to past bot visits.
</p>,
<p key="2">
Block previously recognized bots on their next visit completely — before they even reach your web page.
</p>,
],
description: (
<>
<p>
Integrate Fingerprint Bot Detection with your Web Application Firewall and dynamically block IP addresses
linked to past bot visits.
</p>
<p>
Fingerprint Bot Detection allows you to identify sophisticated bots and headless browsers by collecting and
analyzing browser signals. See our{' '}
<Link href={'/web-scraping'} target="_blank">
Web scraping demo
</Link>{' '}
for an example of protecting client-side content from bots. This demo goes a step further and uses Bot
detection results to block previously recognized bots on their next visit completely — before they even reach
your web page.
</p>
</>
),
descriptionMeta:
'Integrate Fingerprint Bot Detection with your Web Application Firewall and dynamically block IP addresses linked to past bot visits.',
doNotMentionResetButton: true,
instructions: [
<>
Use a locally running instance of Playwright, Cypress, or another headless browser tool to visit the{' '}
<Link href={'/web-scraping'} target="_blank">
web scraping demo
</Link>
.
</>,
<>
Your headless browser will be recognized as a bot, and your IP address will be saved to the bot visit database
displayed below.
</>,
<>
Click <b>Block this IP</b> to prevent the bot from loading the page at all going forward. For demo purposes, you
are only allowed to block your own IP.
</>,
<>
Try visiting the{' '}
<Link href={'/web-scraping'} target="_blank">
web scraping demo
</Link>{' '}
again (either as a bot or using your regular browser).
</>,
<>Your IP address is blocked from the page completely.</>,
],
moreResources: [
{
url: 'https://fingerprint.com/blog/preventing-content-scraping/',
type: 'Use case tutorial',
title: 'Web Scraping Prevention',
},
{
url: 'https://fingerprint.com/blog/betting-bots/',
type: 'Article',
title: 'Betting Bots',
},
],
},
} as const satisfies Record<string, UseCase>;

export const PLAYGROUND_METADATA: Pick<
@@ -391,9 +467,14 @@ export const PLAYGROUND_METADATA: Pick<
descriptionMeta: 'Analyze your browser with Fingerprint Pro and see all the available signals.',
};

export const USE_CASES_ARRAY = Object.values(USE_CASES);
export const USE_CASES_ARRAY = Object.values(USE_CASES)
// TODO: Remove this filter when the final version of the bot firewall demo is ready
.filter((useCase) => useCase.url !== USE_CASES.botFirewall.url);

export const USE_CASES_NAVIGATION = USE_CASES_ARRAY.map((useCase) => ({ title: useCase.title, url: useCase.url }));
export const USE_CASES_NAVIGATION = USE_CASES_ARRAY.map((useCase) => ({
title: useCase.title,
url: useCase.url,
}));
export const PLATFORM_NAVIGATION = [PLAYGROUND_METADATA];

type HomePageCard = Pick<UseCase, 'title' | 'url' | 'iconSvg' | 'descriptionHomepage'>;
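
This diff adds an optional `hiddenInNavigation` field to `UseCase`, while the filter above hides the bot firewall demo by comparing URLs. A minimal sketch of a flag-based filter, assuming the flag is set on the hidden entry, which the diff does not show. `UseCaseLike` is a simplified, hypothetical stand-in for the real type, and the sample entries are reduced to the fields used here:

```typescript
// Simplified, hypothetical stand-in for the UseCase type.
type UseCaseLike = { title: string; url: string; hiddenInNavigation?: boolean };

const useCases: Record<string, UseCaseLike> = {
  paywall: { title: 'Paywall', url: '/paywall' },
  // Assumption: the hidden demo carries the flag.
  botFirewall: { title: 'Bot-Detection-powered Firewall', url: '/bot-firewall', hiddenInNavigation: true },
};

// Keep only entries that are not flagged as hidden, then project to navigation items.
const visibleUseCases = Object.values(useCases).filter((useCase) => !useCase.hiddenInNavigation);
const navItems = visibleUseCases.map(({ title, url }) => ({ title, url }));
console.log(navItems);
```

With the real `as const satisfies` object, the values would probably need to be widened to `UseCase[]` before the optional property can be read, which may be why the PR compares URLs instead.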